McAfee’s Crisis PR

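Just a few brief remarks:

  1. Move fast, contain the impact, and provide a fix;
  2. Don't pile on explanations before saying sorry; that is foolish:
    http://siblog.mcafee.com/support/mcafee-response-on-current-false-positive-issue/
    http://siblog.mcafee.com/support/a-long-day-at-mcafee/
  3. After Symantec's own false-positive fiasco, there is no technical excuse for a whitelist that still misses critical system files;
  4. Leaving Windows XP SP3 out of the QA test matrix is enough to infuriate every user.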

Reposted from: http://blogs.zdnet.com/Bott/?p=2031

McAfee admits “inadequate” quality control caused PC meltdown

Update 23-Apr: Late Thursday night, McAfee posted a FAQ on this issue at their web site. The FAQ includes some of the text from the confidential document I received yesterday and is clearly a later version of that document. However, the details of why the problem occurred and the specific steps that the company plans to take to avoid similar problems in the future have been replaced with general statements. I have highlighted the differences in updates below.

As of 6AM Pacific time on 23-Apr, there is still no statement, apology, or clearly labeled link to support resources related to this issue on McAfee’s home page.

If your company uses enterprise security products from McAfee, you probably had a bad day yesterday. If you’re an IT professional at one of those companies, you’re probably still cleaning up the mess caused by a defective virus signature update that disabled systems running Windows XP with the most recent service pack (SP3). The worst part? According to a confidential document from McAfee, the cause was a fundamental breakdown in the most basic of quality-assurance processes.

From an IT perspective, this is a nightmare scenario: an automatic update that wipes out a crucial system file and that can only be repaired manually. I’ve heard from more than a dozen IT pros and consultants over the past 24 hours who shared their experiences. They are, to put it mildly, unhappy.

What went wrong?

That was the question I asked in my post yesterday, and I formally asked a McAfee spokesperson for an explanation this morning. I was told that an answer will be posted on McAfee’s blog later today. As of this writing, that blog post has not been published.

But I found the answer, straight from the source, in a document forwarded to me by an anonymous source. According to my source, the document was “a confidential communication to enterprise customers” sent via e-mail. In it, the anonymous author acknowledges that the screw-up was thoroughly preventable. The document, titled “McAfee FAQ on bad DAT issue,” is written in Q&A format and includes the following exchange:

8. How did this DAT file get through McAfee’s Quality Assurance process?

There are two primary causes for why this DAT file got through our quality processes:

1) Process – Some specific steps of the existing Quality Assurance processes were not followed: Standard Peer Review of the driver was not done, and the Risk Assessment of the driver in question was inadequate. Had it been adequate it would have triggered additional Quality Assurance steps.

2) Product Testing – there was inadequate coverage of Product and Operating System combinations in the test systems used. Specifically, XP SP3 with VSE 8.7 was not included in the test configuration at the time of release.

Update 23-Apr: The details I quoted above have been scrubbed from the FAQ posted at McAfee’s website. The corresponding section of the FAQ now reads as follows: “The DAT release was designed to target the W32/Wecorl.a threat that attacks system executables and memory. The problem arose during the testing process for this solution. We had recently made a change to our QA environment. Unfortunately, this change resulted in a faulty DAT making its way out of our test environment.”

McAfee has also sanitized the portion of the FAQ that describes its plans to adapt its quality control procedures. Here’s the original text of the confidential document sent to enterprise customers:

9. What is McAfee going to do to ensure this does not repeat?

McAfee is currently conducting an exhaustive audit of internal processes associated with DAT creation and Quality Assurance. In the immediate term McAfee will do the following to provide mitigation from false detections:

1) Strict enforcement of rules and processes regarding DAT creation and Quality Assurance.
2) Addition of the missing Operating Systems and Product configurations.
3) Leveraging of cloud based technologies for false remediation.
4) A revision of Risk Assessment criteria is underway.

And here is the corresponding text as it appears in the final FAQ, published overnight:

What is McAfee going to do to prevent this from happening again?

Nearly all of our 7,000 employees have been working around the clock to help customers like you get back to business as usual and to make sure this never happens again. The vast majority of our customers are now back up and running and we remain focused on those that remain affected.

We are implementing additional QA protocols for any releases that directly impact critical system files. We are also rolling out additional capabilities in Artemis that will provide another level of protection against false positives by leveraging an expansive whitelist of critical system files and their associated cryptographic hashes.

That is mind-boggling. For enterprise customers, Windows XP SP3 is probably the most widely used desktop PC configuration. Leaving it out of a test matrix is about as close as one can get to IT malpractice. Any enterprise customer who received this document has every right to be furious.
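To make the test-matrix point concrete, here is a minimal sketch of a release gate that refuses to ship until every supported product and operating system pair has been tested. The product and OS lists and the function names are hypothetical and chosen only for illustration; nothing here reflects McAfee's actual tooling.

```python
from itertools import product

# Hypothetical support lists for illustration; not McAfee's real inventory.
PRODUCTS = ["VSE 8.5", "VSE 8.7"]
SYSTEMS = ["Windows XP SP2", "Windows XP SP3", "Windows Vista", "Windows 7"]


def missing_configurations(tested, products=PRODUCTS, systems=SYSTEMS):
    """Return the supported (product, OS) pairs that no test run covered."""
    required = set(product(products, systems))
    return sorted(required - set(tested))


def release_allowed(tested):
    """Allow a release only when every supported pair was tested."""
    return not missing_configurations(tested)


# Example: every pair tested except the one named in the confidential FAQ.
tested = [pair for pair in product(PRODUCTS, SYSTEMS)
          if pair != ("VSE 8.7", "Windows XP SP3")]
print(missing_configurations(tested))
print(release_allowed(tested))
```

A gate like this turns a missing configuration into a hard failure rather than something a reviewer has to notice.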

Meanwhile, McAfee’s website is almost completely silent on the issue. Affected customers who visit the McAfee U.S. home page see business as usual, with a rotation of large ads trumpeting McAfee’s latest products. More than 24 hours after the problem occurred, only a single front-page link is available, and it’s blandly headlined, “McAfee Response on Current False Positive Issue.” If you go to McAfee’s Enterprise home page, there is no mention of the problem and no link to any support resources. An overseas correspondent sent me a screen shot of McAfee’s UK home page, which also has no mention of the issue.

That link leads to a blog post by McAfee’s Barry McPherson, published yesterday at 4:29PM. McPherson seems more intent on praising McAfee’s researchers and minimizing the problem than helping users. He writes: “We believe that this incident has impacted less than one half of one percent of our enterprise accounts globally…” I find it difficult to believe that the company could come up with an accurate estimate at all, much less do so within hours after the problem was identified. It certainly doesn’t match up with the reports I’m hearing from the field.

Update 23-Apr: Yesterday afternoon, the McAfee blog post was edited to remove this reference. The sentence now reads, “We believe that this incident has impacted a small percentage of our enterprise accounts globally and a fraction of our consumer base…”

From a crisis management perspective, McAfee’s response has been disastrous. If the company truly cared about its customers, the home page would contain an apology from the CEO and links to detailed support information. Instead, it appears that the company is hoping its customers will just forget about it.

Based on the 100+ comments to McPherson’s post, customers who were hit by this error aren’t likely to forget about it soon. And when they figure out that a lapse in the most basic of quality control steps caused them to spend thousands of dollars in IT manpower and lost productivity, they’re likely to be angrier still.

Reposted from: https://kc.mcafee.com/corporate/index?page=content&id=KB68787

McAfee DAT 5958 False Positive Error – FAQ

Summary
Please note that investigation of this issue is ongoing, and these FAQs will be updated appropriately as we learn more.

What threat was McAfee trying to detect that resulted in a false positive error?
McAfee added detection for variants of the W32/Wecorl.a threat in DAT file 5958. This detection caused a false positive on the svchost.exe Windows system file. The threat parasitically patches the svchost.exe file by modifying data at the entry point or the entry point itself of the original file, to maintain control on the system. In some instances the patch has been found to be polymorphic in nature. McAfee had observed prior infected versions of svchost.exe files and had detection for this threat. This specific detection was added to target a cluster of infected svchost.exe files gathered through our malware collections, directly associated with samples from the W32/Wecorl.a families.

The false positive occurred as a result of new signatures targeting new variants of the Wecorl family of malware when invoked on the file svchost.exe as a part of the memory scanning process. Details of this threat family can be found here: http://vil.nai.com/vil/content/v_153184.htm. Enhanced drivers in the 5958 DAT were authored to detect some low prevalence variants seen recently.

Why did detection for this threat require an invasive approach for detection and remediation?
To remediate this type of threat, detection is customarily written to kill the infected process, in some instances causing a reboot (a standard Microsoft safety action), and allowing for full remediation of the infected system. This type of remediation is standard implementation to gain access to the file objects that may be locked by the running processes.

Unfortunately this caused removal or attempted removal of legitimate svchost.exe file causing issues from network connectivity loss to rendering systems unstable due to the false positive.

Doesn’t McAfee white list known Windows system files?
Complex and sophisticated malware frequently target Windows system executables and attack their memory space (e.g. via DLL injection). McAfee DATs use Whitelisting techniques to avoid scanning and preventing false positives on Microsoft files in the majority of situations, for example, if this was a simple scan of the file as it was accessed on the file system, a false positive would have been prevented. But because this was a memory scan of the running process that then caused a subsequent scan of the file on disk these mitigation techniques were unfortunately not invoked.
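The gap described in this answer is that the whitelist was consulted on the file-access scan path but not when a memory scan led to a scan of the file on disk. One way to avoid that kind of gap, sketched below under stated assumptions, is to run the whitelist check in the remediation step itself, so it applies no matter which scan path raised the detection. The function names, the whitelist layout, and the choice of SHA-256 are all hypothetical; this is not how McAfee's engine is implemented.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Hex digest of the file content."""
    return hashlib.sha256(content).hexdigest()


def is_whitelisted(name: str, content: bytes, whitelist: dict) -> bool:
    """True when the content hash is a known-good hash for that file name."""
    return sha256_hex(content) in whitelist.get(name.lower(), set())


def decide_action(name: str, content: bytes, whitelist: dict, scan_path: str) -> str:
    """Choose a remediation for a detection.

    scan_path ("file" or "memory") is accepted so callers can record it,
    but it is deliberately ignored here: the whitelist check runs on
    every path before anything destructive happens.
    """
    if is_whitelisted(name, content, whitelist):
        return "report-only"
    return "quarantine"


# Placeholder bytes stand in for real file content in this example.
clean = b"known-good svchost image"
whitelist = {"svchost.exe": {sha256_hex(clean)}}

print(decide_action("svchost.exe", clean, whitelist, "memory"))
print(decide_action("svchost.exe", b"patched image", whitelist, "memory"))
```

Keying the whitelist on a hash of the content rather than on the file name alone means a genuinely patched svchost.exe still falls through to quarantine.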

Exactly which versions of Windows operating system and the svchost.exe file were affected?
A subset of systems running Windows XP Service Pack 3 and having specific versions of the svchost.exe file were affected. Svchost.exe files found on Windows 2000, Windows 2003, Windows XP Service Pack 1, Windows XP Service Pack 2, Windows Vista, Windows 7 and older versions of Windows were not affected.

Details of svchost.exe files affected are:

File Size (bytes)   OS                 File Version    MD5
14,336              XPPRO_SP3_x86_v1   5.1.2600.5512   E4 10 EC 73 E2 BE 2A 41 D9 23 B0 06 F5 1C 84 27
14,336              XPPRO_SP3_x86_v2   5.1.2600.5512   27 C6 D0 3B CD B8 CF EB 96 B7 16 F3 D8 BE 3E 18
14,336              XPPRO_SP3_x86_v3   5.1.2600.5512   A7 81 24 26 8A 77 F4 19 02 DB 18 F6 22 AF E6 13
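For administrators who want to check whether a given svchost.exe is one of the affected builds, the MD5 values listed above can be compared against the file's own digest. The following is a minimal sketch using Python's standard hashlib module; the function names are hypothetical, and the path in the usage note is only the usual default location, which may differ on a given machine.

```python
import hashlib

# MD5 values copied from the table of affected svchost.exe files.
AFFECTED = [
    "E4 10 EC 73 E2 BE 2A 41 D9 23 B0 06 F5 1C 84 27",
    "27 C6 D0 3B CD B8 CF EB 96 B7 16 F3 D8 BE 3E 18",
    "A7 81 24 26 8A 77 F4 19 02 DB 18 F6 22 AF E6 13",
]


def normalize(digest: str) -> str:
    """Strip spaces and lowercase so digests compare as plain hex strings."""
    return digest.replace(" ", "").lower()


AFFECTED_MD5 = {normalize(d) for d in AFFECTED}


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_affected_version(path: str) -> bool:
    """True when the file's MD5 matches one of the listed affected builds."""
    return md5_of_file(path) in AFFECTED_MD5
```

A call such as is_affected_version(r"C:\WINDOWS\system32\svchost.exe") would then report whether the installed copy matches one of the three listed builds.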

How exactly were user systems impacted?
McAfee corporate customers who have the McAfee VirusScan Enterprise product have reported a variety of symptoms, ranging from a system “blue screen” (not to be confused with BSOD, but due to the issues with Explorer and svchost.exe), loss of network connectivity, inability to use USB, and experiencing a perpetual state of reboot. Users have reported these symptoms when both the file is present on the system (in quarantine), or has been deleted entirely.

Minimal impact has been observed to McAfee’s consumer customers because McAfee rolled back the faulty DAT before the update hit the majority of consumer user systems.

Why was VirusScan Enterprise 8.7 primarily affected as compared to VirusScan Enterprise 8.5?
Because of the different implementation of memory scanning within the products, VirusScan Enterprise 8.7 customers were more broadly affected by the false positive.

How do affected customers restore their systems?
McAfee has enumerated instructions to restore systems to their normal functional state with the right critical Windows files in place. McAfee has also developed a SuperDAT remediation tool to restore the svchost.exe file on affected systems.

Specific instructions are available in the McAfee KnowledgeBase in KB68780.

How did this DAT file get through McAfee’s Quality Assurance process?
The DAT release was designed to target the W32/Wecorl.a threat that attacks system executables and memory. The problem arose during the testing process for this solution. We had recently made a change to our QA environment. Unfortunately, this change resulted in a faulty DAT making its way out of our test environment.

What is McAfee going to do to prevent this from happening again?
Nearly all of our 7,000 employees have been working around the clock to help customers like you get back to business as usual and to make sure this never happens again. The vast majority of our customers are now back up and running and we remain focused on those that remain affected.

We are implementing additional QA protocols for any releases that directly impact critical system files. We are also rolling out additional capabilities in Artemis that will provide another level of protection against false positives by leveraging an expansive whitelist of critical system files and their associated cryptographic hashes.

McAfee recommends customers sign up for our Support Notification Service to get real critical product information via e-mail. Go to the SNS Subscription Preference Center to subscribe. For more information, view the SNS KnowledgeBase article and FAQ: KB67828.