Wednesday, September 26, 2012

Don't Do Stupid Things When Trying to Help Protect the Environment

Everyone is trying so hard to help the environment. Energy saving there, recycling here. Soooo green...

...and sometimes so stupid. Look at the Nokia charger initiative. Newer chargers consume less than 30 milliwatts in standby mode, which is excellent. And each time a phone finishes charging, there's a message saying that charging is complete and you'd better unplug the charger to help protect the environment.

This message is presented as a very clever step that makes every Nokia phone user responsible for the green future...

Well, no. The message about charging completion was always there; only the part about the environment was added at some point.

There's something weird here, isn't there? It's the Nokia charger consuming power in standby, but the user has to make the extra effort. Why?

The real reason is that those chargers are damn efficient already. Making them more efficient would require better electronic components that are more expensive, and that would drive the charger's price up. Guess who would have to pay for that? Right – and people don't like to pay more for greener chargers.

So the pseudosolution is to change the "charging complete" message to contain the word "environment". It does nothing and still puts the extra burden on the user, but now it lets him feel better for "doing something green". Plus Nokia can now run a PR campaign full of impressive numbers on energy saving and reducing the environmental impact.


Yes, impressive numbers. To actually estimate anything you have to run the numbers.

Numbers don't lie.

Suppose we have a Nokia charger in standby, plugged in for a full year and consuming 30 milliwatts. That's the worst-case waste scenario.

Every day the charger will consume 0.03 watts × 86400 seconds (the number of seconds in a day), which totals 2592 joules.

Now multiply that by the number of days in a year and then by the estimated number of chargers out there, and the result is very impressive.

Except that the number alone is meaningless. You have to compare it with something and see if it makes any sense.

Let's pretend that each day you take the stairs instead of using an elevator. Assume your weight is 60 kilograms and you go three meters up. This makes you spend at least 3 meters × 60 kilograms × 9.8 meters per second squared (gravitational acceleration), which totals 1764 joules.

You see, the elevator will not need to use that energy because you took the stairs.

Climb two floors and you've totally offset the energy saving of the unplugged charger. Plus now you've got some physical exercise, and you have better chances of not needing heart surgery and a ton of drugs, which are not that environmentally friendly.

Which will you choose – to do something stupid and protect the environment or to do something useful and protect the environment?

Thursday, September 13, 2012

Safety Device Useless Because of Poorly Worded Manual

The battery terminal covers... Tiny pieces of plastic that could prevent shorting a mobile device battery while it is being stored or transported, much better than the stupid "don't store the battery in a pocket next to a paper clip" warning.

Now it turns out there is one company that ships mobile devices with battery terminal covers. Let's look at the Nikon Coolpix AW 100 camera manual. The "Confirming the Package Contents" section clearly shows that the camera battery is shipped with a terminal cover.

One small step for a company, a giant leap for mankind. Sort of.

The problem is that shipping the battery with the cover IS NOT FUKKEN ENOUGH, because users are not familiar with what to do with it. Scary paperclip warnings have been there for ages, but users have never seen a terminal cover before. With that background they can only fear a paperclip, not do anything constructive with a terminal cover.

Camera Reference Manual to the rescue? Okay... the "For Your Safety" section says to observe the following precautions when handling the battery for use in this product, and then there's a bullet list that, among other stuff, includes this:

[B1] Do not short or disassemble the battery or attempt to remove or break the battery insulation or casing.

Okay, then two bullets later (both completely unrelated to shorting the battery) it says this:

[B2] Replace the terminal cover when transporting the battery. Do not transport or store with metal objects such as necklaces or hairpins.

This is not how the manual should handle this. Here's what's wrong.

First of all, look at the [B1] wording. You see, shorting the battery and disassembling the battery are listed as if they were similar actions. They are only similar in that they are both a Bad Idea™. Other than that, they are very different. Disassembling the battery is usually a deliberate action, but shorting the battery can be either deliberate or accidental (like accidentally connecting the terminals with a paper clip in a pocket).

These two things should be worded separately. Disassembling or otherwise hacking the battery with a pickaxe should be a separate bullet point.

Next, [B1], which mentions that shorting the battery is a Bad Idea™, and [B2], which mentions the terminal cover and the metal objects, are separated by two completely unrelated bullets.

This makes [B1] and [B2] unrelated in the reader's mind although they both talk about shorting the battery and how to avoid it.

Finally, look at the [B2] wording. This on its own deserves careful analysis.

It says the user should put the cover onto the battery while transporting it, but it doesn't mention battery storage, although storing and transporting the battery are totally equivalent in terms of shorting and the risks thereof.

Next, [B2] says the user should not transport or store the battery with metal objects. Why the F should he not?

Remember, there's a terminal cover that he should have put onto the battery, and that cover should prevent shorting. When the cover is on, it's okay to store the battery in a bag full of tiny metal objects – no shorting will happen.

Do you see what's going on? According to the manual, the terminal cover just needs to be there, but it doesn't protect against shorting the battery. According to the manual, it's simply useless, or maybe protects from dirt.

So, dear Nikon, you've implemented the terminal cover, you've produced and shipped it, and then you wrote in your reference manual that said cover is a useless piece of plastic that just needs to be there. Every box with a Nikon Coolpix AW 100 now contains a terminal cover and a manual implicitly declaring that the cover is useless.

EPIC FAIL

Here's how you fix it.

1. Remove the "do not short" wording from [B1]. Leave only the wording about disassembling and otherwise deliberately (most of the time) messing with the battery in [B1].

2. Move the "do not short" wording into [B2]. Shorting the battery (either deliberately or accidentally) is a separate problem and should be addressed separately. The terminal cover exists to avoid accidentally shorting the battery while it is being stored or transported. Remove the "do not transport or store with metal objects" wording and replace it with a phrase saying that a metal object (such as whatever examples you want) can accidentally make a connection between the uncovered terminals and short the battery, and that to avoid this the user should only store or transport the battery with the terminal cover on.

This makes the terminal cover ACTUALLY USEFUL and saves a billion cute kittens.

Thursday, August 23, 2012

How a Trivial Technical Problem Turns into a Major Hazard

These days portable devices are EVERYWHERE. Phones, computers, and power tools all get decent mileage on batteries.

EXCELLENT

Now the mileage is not always decent ENOUGH, so you'll often want a spare battery or two. This applies to all kinds of devices, number one being Android phones and number two being power tools.

POWER TOOLS

Unlike other devices, most professional power tools come with two batteries. You charge both, then use one of them until it goes empty. The empty one goes to the charger, the second one goes into the tool. If you have to use the tool where there's nowhere to plug in the charger – you may want to have more batteries with you.

Anyway, at some moment there's a battery that is connected to neither the tool nor the charger. It is on its own.

Have you heard that a battery has TERMINALS? You know, the metal thingies that connect the battery to the tool. Do you have an idea of what happens if there's a direct electrical contact between the terminals?

That's called a SHORT CIRCUIT. This is FUKKEN BAD for teh battery – it will spark, overheat, explode, hell will break loose. Search for "battery explosion" on YouTube for details.

Do companies who produce phones, laptops and tools know about that?

Sure they do. Let's open a manual for... say... the Bosch GDR 10,8 V-LI impact wrench. Page 14 says this:

When battery pack is not in use, keep it away from other metal objects, like paper clips, coins, keys, nails, screws or other small metal objects, that can make a connection from one terminal to another. Shorting the battery terminals together may cause burns or a fire.

This is how companies typically "solve" the problem. A warning in the manual.

EXCELLENT ARE YOU KIDDING ME?

Suppose I'm packing tools into a bag. I want several screwdrivers, wrenches, pliers, a hammer, some wire and maybe some screws. And finally I want a power tool and a second battery pack for that tool.

You see, if I put the battery pack right into my bag, there's a risk that the wire or the screws or some combination thereof will get to the terminals of the battery pack and cause a short circuit.

Looks like I'm screwed.

Not really. There is a solution. It is wrapping the battery in plastic film. Most likely that will be a plastic bag.

Yes, a plastic bag. The tool costs several hundred bucks but you have to put its battery into a plastic bag because otherwise it can short circuit and then hell breaks loose.

EXCELLENT ARE YOU KIDDING ME?

Let's look at it in detail. Suppose you want to design a phone, a computer or a power tool. You will have requirements – maximum weight, maximum size, minimum performance, maximum heat dissipation, duration of runtime off the battery and many, many others.

You see, these requirements conflict big time. If you want a bigger battery, you get a bigger, heavier device. If you want more power, you'll have problems dissipating that power.

Now you spend months and years putting all that together – you choose all the right components and pack them the right way so that your device is powerful enough, small enough, lightweight enough and lasts long enough on one charge.

You're so FUKKEN PROUD of yourself. You hire a ton of marketing specialists who market ur device everywhere talking about its sleek design and its amazing power and lots of irrelevant crap like number of screen colors of a phone and voltage of a power tool, but...

... your user buys your device for several hundred dollars and has to pack the second battery into a plastic bag to avoid a short circuit ...

... and all you can do is put a warning into the manual listing all kind of metal and other conductive crap that can cause the short circuit?

EXCELLENT ARE YOU KIDDING ME?

Here's what to do. Next time you design a battery design a protective cover for the terminals.

If you design a power tool – that's very easy. The battery typically connects to the tool handle. Clone the part of the handle the battery connects to, cut off all the extra plastic that forms the tool body, and form the rest into a plastic cap. You've got a plastic cap that fits the battery in the most secure way possible and protects the terminals against short circuits.

If you design some other kind of device at least provide a tiny plastic box.

Never ship spare batteries without a protective cover. Make protective covers available through your service centers in case your users lose them – sell them dirt cheap.


Now go to your device manual and change the warning to say "dude, hell breaks loose if there's a direct contact between the terminals, to avoid this put the protective cover onto the battery when the battery is not connected to the device or the charger".

THIS IS SO FUKKEN EASY and IT MAKES A REAL DIFFERENCE

Bonus points – you get fewer lawsuits, easier public relations and maybe you can save a bit on marketing your device. Less risk of short circuit sounds FUKKEN GOOD...

It's a SAFER DEVICE and everyone likes safer devices.

Monday, August 13, 2012

Your Order is One Upselling, Do You Want Hatred With That?

If you google for upselling you get more than 6 million hits. It's so popular.

The whole idea is the following. Your business is selling some premium crap and to maximize ur revenues you want to sell tons of that premium crap. One of the techniques is upselling.

You wait till ur customer decides to buy something and then offer him something extra... or a bigger version of what he wants to buy... or both. Like the classic McDonalds "do you want fries with that?".

Now the upselling proponents say it increases revenue at almost no cost. The proponents explain it this way...

...some unfortunate dude at McDonalds has already made up his mind to get this and that...

...and they're already in front of the counter spelling their order out...

...it takes almost no effort for a clerk to ask "do you want fries with that"?

Like, if the dude is an imbecile and kind of forgot to order fries, he will spend extra; but if he doesn't want fries, there's nothing to lose here.

Kind of a no-brainer. Just ask everyone "do you want X with that?" That's easy, isn't it?

Now dear upselling proponents...

STOP RIGHT HERE AND THINK

This is just annoying. ANNOYING. Go find an online dictionary and look this word up.

Ur customer wanted a coffee. "Do you want ice-cream with that?" No, F off!!!

Ur customer has already ordered more than a person can eat per day. "Do you want fries with that?" No, F off!!!

Ur customer orders a single cheeseburger. "Do you want cola with that?" No, F it, ur customer is not that stupid – he knew perfectly well whether he wanted cola when he came to the counter.

Do you really think that a person will go to the counter to order a single cheeseburger and not order cola if he really wants one? Really? Is he that stupid?

Yes, sometimes upselling may be a good idea, but don't just try to upsell anything to anyone.

Otherwise it's ANNOYING and now that you've looked up this word there's one more for you -

...that second word is HATRED. When you do something ANNOYING over and over again that triggers HATRED...

...and that's not some generic untargeted HATRED, but HATRED towards ur business.

Do you want customers to experience HATRED towards ur business? Do you want to spend a fortune on advertising and then trigger HATRED by clueless use of upselling?

The proponents of upselling measure the revenues and conclude that using "do you want X with that" upselling leads to a few percent of revenue increase.

Excellent.

Now, did they measure the extra time people have to wait in line behind the dude who is being upsold to?

It's that easy. You're fourth in line at McDonalds. The dude at the counter orders a single cheeseburger and the clerk tries to upsell a cola to him; the dude refuses. You have to wait an extra five seconds in the F***ng line. Now the next dude in line comes to the counter and the clerk tries to upsell to him. And then the next dude. And then it's ur turn to waste time on upselling.

How do you feel about that? You've just wasted 20 seconds on upselling. Excellent investment. Come back tomorrow for more.

Did the upselling proponents measure this waste of time and the HATRED it induces?

Do you still think upselling has almost no cost? Do you want upselling with that?

Friday, July 27, 2012

Raspberry Pi is NOT a credit card sized computer

Raspberry Pi... A small rather cheap and rather powerful computer. No problems with that.

What about marketing?

You see, it is marketed as being a credit card sized computer. Well, let's see...

A credit card size is 85.60 × 53.98 mm and Raspberry Pi size is... 85.60 × 53.98 mm

EXCELLENT!!!

Yet a credit card is 0.76 millimeters thick (there's an international standard for that) but Raspberry Pi is 17 millimeters thick.

Now grab a calculator and check that 17 millimeters is something like 22.37 times the permitted credit card thickness. This kind of means that you need at least 22 credit cards to occupy the volume the Raspberry Pi occupies (and to make any credit counselor seriously worried).

Don't believe that? Okay, follow this plan: get a Raspberry Pi, get to the nearest ATM and try to insert the Raspberry Pi into the ATM card slot.

IT WON'T FIT IN THERE

It's not credit card sized, it's the size of a cigarette pack, but that's not so good for marketing.

This also explains why the "electric imp" module being packed into an SD card is so cool. Although you likely won't really need to use it in the form of an SD card, you can be damn sure that it fits into SD card dimensions and you don't need 7.98 times the height an SD card would occupy.

Monday, March 19, 2012

Static Analysis is Not to be Offered Through Fear

Recently I was wasting time watching stuff on YouTube and at some moment I thought that static analysis tool vendors might have some marketing materials there.

Turns out, they do. Here's a neat "BUY IT OR UR SCREWED" video from Coverity dudes:

WOW! SCARY!

The irony is, I'm rather old and I have a very good memory. I watched the above video and was sure I'd seen something like that before.

But where?

Who might use these "use it or ur screwed" tactics so persistently that it got imprinted in my brain and I now remember it?

Here they are: toilet cleaning chemicals.


WOW UR TOILET IS FULL OF GERMS USE OUR CHEMICAL OR UR SCREWED

What can I say?

Dear Coverity dudes! It's gonna be okay, just don't lick ur toilet.

Monday, February 27, 2012

How to Not Present Static Analysis Results

Aside from the sad fact that it's impossible to try Coverity without first getting the Seal of Coverity Sales Force Approval, the next big difference between Coverity and PVS-Studio anyone can see is...

TEH MARKETING MATERIALS

How This Could be Done

Let's look at a typical PVS-Studio scan report. This one happened to get right in my way, so I link to it here.

A misprint... Yes, I see. Risk of array overrun... I see. Several more subtle defects... I see. No freaking idea how those defects affect the program's functioning, but they are presented quite well and are easily assessable by anyone who is willing to pay attention to them.

One might wonder how the program could run with those horrible defects.

This is quite simple. Defects that actually manifest themselves have been identified earlier using other methods – plain old debugging, unit tests, peer review, whatever else. All the rest require some effort to get exposed. That might be some unusual dataset. That might be some unusual sequence of user actions. That might be some unusual error indication. That might be upgrading a compiler or a C++ runtime.

Defects are defects. "You can't lie to the compiler," they say, but not all defects are created equal. Those subtle things will sit in the codebase for years, and then all of a sudden someone runs PVS-Studio on the codebase and WOW WHAT A HANDFUL OF HORRIBLE BUGS ZOMG ELEVENELEVEN!!! they will think.

So a scan report alone is worth nothing – it still takes a developer who is familiar with the codebase to assess and possibly address each reported defect. The PVS-Studio scan report does exactly the right thing – it presents defects one by one together with some analysis, nothing more.

How This Should Not Be Done

Now look at Coverity marketing materials. You will have a very hard time finding a scan report like the one linked above among Coverity scan results. Yet once in a while the Coverity dudes will issue an Integrity Report.

An Integrity Report is a very enthusiastic document containing such words as mission, seamlessly and focused on innovation. Not bad as a starter – at least the presence of those keywords clearly signals that there's too much marketing in the first three pages.

Moving on to Table A... Oh, this table shows a distribution of project sizes. Using the word distribution somehow implies that the data gathered has some statistical significance and so deserves extra trust. Well, with 45 projects total, trying to build a chart and call it a distribution is very silly. You see, they had TWO projects with more than 7 million lines of code. That's unbelievable, I'm breathless.

All the rest of the Report is also full of similarly meaningless tables. Yes, it is cool that you found 9.7654 defects per square foot of some project. But until you let me try your program – I don't care; those figures don't matter any more than a 132 percent efficiency claim (the post is five years old, yet still relevant).

Fast forward to Appendix A. Tables 2 and 3 summarize defects by assigning each a category and an impact. Let's see...

Control flow issues. What's that? Is it when I forget to put a "break;" at the end of a "case" in a "switch" statement? So you say it has medium impact... Okay. What about "main()" returning immediately? That's a control flow issue as well, and don't tell me it has medium impact. Not all control flow issues are created equal.

Null pointer dereferences have medium impact, don't they? Sure, my code dereferences null pointers here and there, and each time that happens users get a candy. Perhaps the Report authors meant potential null pointer dereferences, which is a situation where code dereferences a pointer without first checking that it is not null. Mind you, checking a pointer each time before it is dereferenced clutters code big time. Again, not all null pointer dereferences are created equal.

Error handling issues have medium impact. What is that? Is that checking for error codes of Win32 API functions? Sure, any time a program wants to open a file without validating whether the attempt to do so failed and just proceeds reading it's almost no big deal for the user. No access to the folder? We'll pretend we've saved the file. Whatever. Not all error handling issues are created equal.

Integer handling issues have medium impact. Sure, overflowing an integer while computing a memory allocation size is no big deal. Just allocate whatever amount it happens to be and pretend it's the right size. Not all integer handling issues are created equal.

Insecure data handling has medium impact. What's that? No freaking idea, but something tells me not all cases of insecure data handling are created equal.

Incorrect expression – medium impact. Sure, misplace braces wherever you want, no big deal.

Concurrent access violations – medium impact. You just spend the rest of your life debugging them, no big deal.

API usage errors – medium impact. Your code erroneously forgets to specify the path and that causes the entire contents of Windows\System32 to be deleted. No big deal.

Program hangs – medium impact. The program hangs only when run on a computer outside a Windows NT domain. You run it just fine inside your corporate network, then go to a trade show and it stops working; your laptop turns into a thousand-dollar space heater with a screen. No big deal.

Why is no category assigned low impact, I wonder? Is it because the authors didn't dare to label a software defect low impact just because it belongs to some category?

This doesn't work. You can't throw several thousand defects into several categories and then assign each category an impact level. This is just impossible. If you're a software developer, you must realise that beyond the shadow of a doubt; otherwise just quit your job immediately and go to the nearest McDonalds outlet – they have a "help needed" sign waiting for you.

The whole Integrity Report is just a big mess of numbers and diagrams. Its usability is not even zero – it is negative. The report scares the hell out of anyone who is concerned about software quality and stops there.

The Outcome

So what's the difference between PVS-Studio marketing materials and Coverity marketing materials? The former present facts that one can interpret and verify. The latter just try to scare by summarizing, with no chance of verification.

Because not everyone deserves a free Coverity trial.

Wednesday, January 4, 2012

PVS-Studio – the Greatest Trial of the 21st Century

The Installation

John Carmack says Visual C++ developers should try PVS-Studio – painless demo download. Yes, the demo download is indeed painless, and clicking through makes the program just install. Well, not exactly – first you have to close all instances of Visual Studio you have running, but then it's a simple click-through. And then you get the reduced functionality.

The Trial Limitations

The functionality is not reduced as in "only the first episode of Quake until you pay, Sir"; it is done in a much more clever way – instead, the analyzer output is slightly garbled.

While the analyzer scans through code, it emits messages like "in file X.cpp on line Y there's this odd thing". Some messages will look like that, but some will read "file X.cpp [AND I WON'T TELL YOU WHICH LINE – TRIAL RESTRICTION]", which looks kind of silly but is in fact a very good greediness-usability tradeoff.

Later versions added a bonus layer of greediness. Messages that do contain the line numbers ungarbled can be double-clicked, and that will open the problematic file in the VS IDE editor and scroll right to the problematic line. Now, with an extra layer of greediness, once you double-click you're presented with a modal dialog with a progress bar that runs for about 15 seconds, and until it finishes you're not allowed at the code.

That's a very cleverly engineered greediness-usability tradeoff. You can't ask 3.5K euros for a piece of software without showing it first, and you can't allow a full-blown version to be used right off the shelf without a proof of purchase. With these limitations the program is still mostly usable, but its commercial use is effectively prevented. Think how you run analysis during a daily build and it says you have "a problem in file X.cpp at [WON'T TELL YOU WHICH LINE]" and you need to hire Hercule Poirot just to deduce where the warning belongs, because the file is 3 thousand lines long and you don't even know whether the warning reports an actual problem.

Now we get to actual (trial) use.

The Good

First of all, the amount of really weird subtle stuff the program can find in real code is amazing.

It will look through a mumbo-jumbo of some bitwise "or" of a dozen Win32 API flags and note that SHITTY_FLAG_PROHIBIT_DIRECTORIES is used twice in that bitwise "or". That's not the kind of problem a human can reliably find, but software does it just fine.

It will look at some very old code and see that you try to "delete" a smart pointer. Who would "delete" a smart pointer in the first place? Well, your code will, because that was a raw pointer before refactoring and all but one occurrence were edited during the refactoring. The problem wouldn't manifest itself because that code actually ran once a year, when a runtime error occurred during a full moon; and when that unlikely combination of events took place, the program would crash nastily, but the affected user wouldn't be able to reliably reproduce it and so couldn't file a useful report – he'd just remember that ur program crashed on him a couple of times.

It will note that you have two enumerations and a switch where the expression being switched on uses members of one enumeration but the values in the case labels are from the other enumeration. That code would work for years until you altered one enumeration but failed to alter the other.

It will find numerous other very weird pieces of code that are completely legal C++ but for whatever reason don't make sense and often constitute an error. Look into the "General Analysis" section in the online manual – the list is pretty impressive, and most of those problems will indeed reside and stay dormant in commercial software that has been shipped for years.

Once analysis is complete, it's not uncommon to think "How the F could this program be shipped with that many defects?"

The authors will often present such examples and use them as proof of the tool being essential, as they claim it finds crap in code at very low cost. This is a very bold claim and needs careful verification.

Have you noticed that this post has been very excited and optimistic so far? Well, let's go to the dark side.

And Then It Goes Wrong

The key requirement for any automated analysis tool is that it should fail as rarely as possible. This means exactly the following. Suppose the program can detect cases where you have a long bitwise "or" and two components of that "or" are the same.

The program

1. should emit a relevant warning for every occurrence of such a case, and
2. should not emit that warning anywhere else.

The truth is, PVS-Studio is a program, and like every usable program it contains its fair share of bugs. Yes, a program for finding bugs can and often will contain bugs.

A short digression is needed here. Compilers also contain bugs (lots of them, and they are nasty), and that leads to compilation unexpectedly failing or the emitted program code not conforming to the language Standard. If you don't realize that – get out of the industry and go to a local McDonalds outlet right now – they often have "help needed" signs on display. Digression ends here.

So PVS-Studio contains bugs. Those bugs sometimes lead to warnings not being emitted. Like:

1. you debug or review your program for an hour and
2. see that there's a bug caused by a situation the PVS-Studio documentation lists a warning for, but
3. PVS-Studio will not emit that warning when presented with the code.

That's the real world, where all programs have bugs.

This area (no warning where a warning should be) is quite problematic – to estimate how many warnings PVS-Studio fails to emit, one would have to somehow analyze the code himself, and that's incredibly time-consuming and sometimes just impossible for any large codebase.

Again, the authors are not to be blamed here. They fix many reported bugs right away, and all users (trial users included) need to report those bugs promptly.

Currently the weakest link is that templates are not fully supported, so warnings are not always emitted for suspicious pieces of code if those pieces are inside a template. This is not a minor problem.

Non-templated code is final – you can gather a dozen senior-dev wise owls and together read through the code and conclude that it is okay. Templated code is not final – it is not actual code until you fully parameterize it. And btw, templates can be parameterized with other templates. So you can have a five-layer templated apple pie (like a vector (a template class) storing some smart pointers (also a template class) and using some custom allocator (also a template class), with something else templated as well) and some really problematic joint between the layers that could lead to a singularity developing into a black hole – very hard to diagnose with wise owls, and this is where an automated tool would be of great help. So not having full template support is not a minor problem at all.

Also, bugs in PVS-Studio will sometimes lead to a warning being emitted where there's no problem even formally – the program will look at something and say "you have a function parameter passed by copy" where the parameter is in fact passed by reference, or something equally irrelevant. Such bugs are usually fixed by the authors very fast, and users should of course report them promptly. This happens quite rarely and is not that much of a problem.

And Then No-one Knows Whether It Went Wrong

And finally, after “warning not emitted where it should be” and “warning emitted where it definitely shouldn't be”, there's a giant grey area where code looks suspicious and it's impossible to say whether it contains a problem without further analysis. For example, 4 is the number of bytes in a 32-bit “int” and 32 is the number of bits in that same type. So when you use either of those numbers it might be that you manipulate a 32-bit number byte-wise or bit-wise, and then your code is unportable – technically it's impossible to know unless you analyze the surrounding code.

Such grey area warnings are emitted very often for different portability cases (the “Viva64” warnings group). The program could definitely do much better here.

For example, suppose there's a switch with different numbers being returned from different case labels – that can be used for computing some weight coefficient for each element depending on the element's state. Like “for new elements return 1, for partially prepared return 2, for super prepared return 3”, and so on. Now if the numbers 4 and 32 appear in that sequence – a warning is emitted about a “dangerous magic number used”.

Wow, The Mighty Program. Surely when the numbers 1 through 32 are used in a uniform way, only 4 and 32 are dangerous and may be used by Chuck Norris alone, while the other numbers are not dangerous and can be used by anyone. People only get screwed by abusing 4 and 32 and never by abusing 13765 – Oprah says so in every show.

This can be improved – the program could identify this and other reasonable use-cases and not emit a warning for them. This is only one example, but there're many of those and they all can be improved. The program contains a really impressive copy-paste detection technology and this technology can be used to detect legitimate cases in the grey area.

To be fair, PVS-Studio's rate of false warnings is not that high – try Visual C++ /analyze, which emits a warning separately for each time a header is included into the translation unit – that's what “barely usable” means. Of course, comparing well to a nearly unusable tool doesn't automatically make PVS-Studio brilliant – both have to improve.

And the Evil Warnings Suppression

You'd perhaps object that those grey area warnings are inevitable and that's what warning suppression is for. If you really believe in that – the nearest McDonalds outlet is preparing a “help needed” sign for you right now.

I should have resorted to ALL CAPS here, because this is the single most important thing in the whole post.

Warning suppression is to be used as the last resort only, not as a casual thing. The reason is it damages your code.

Suppose you have an imaginary warning V999 emitted on some line where there's actually no problem and you decide to suppress it. You have to add a “//-V999” comment on that line. Done.

Now you're screwed.

Whenever you put a “//-V999” suppression comment onto a line of code, V999 is no longer emitted on that line no matter what. Sooo....

Each time you edit that line of code you have to reevaluate whether the same warning is not emitted for some other reason. You have to drop the comment, re-verify the code, likely put the comment back. Good luck if you have more than one warning emitted on the same line. You won't die, but your life is no luxury anymore.

This means that you have to document precisely what the warning was about. Otherwise, with the next analyzer upgrade, it may happen that the warning is no longer emitted for the original piece of code (a “warning where everything is okay” bug fixed in the analyzer) but is now emitted for some other, unrelated problem (remember, weird stuff hides in codebases for ages). If you just look, conclude “okay, only V999 is emitted as before” and put the comment back, you effectively suppress another occurrence of the warning. Good luck with that too. Again, you won't die. Or maybe you will.

And of course, if a line contains both a real problem that should be diagnosed with V999 and a harmless thing that once triggered a false V999, and you suppressed V999 for that line – then even once the analyzer improves and can detect the real problem, there will still be no warning.

So effectively you have to reevaluate all suppressed warnings after each analyzer upgrade so that you don't miss a billion-dollar bug being reported.

That's all for the technical part. The program is indeed very advanced technically – it detects really stupid things in real code where you least expect them – both in junior developers' crappy code and in senior developers' well-tested, long-shipping code. As Carmack says, you will find bugs.

Finally Computers Do What They Are Good At

It's worth noting that a lot of defects in almost every real codebase are due to copy-paste. The same operand used twice around an operator in a long expression. The same code in both the “if” and the “else” branch. Different functions implemented identically. This actually happens in actual code shipping for years.

Detecting such stuff is what computers can be very good at. “Dumb” robots just browse code and find patterns very reliably (unless there's a bug – see above) – not something a human can do at reasonable speed and with reasonable reliability.

This reminds me of the good old days of Windows 95.

If you're old enough to actually have used floppy disks (rectangular things storing 1.44 decimal megabytes) you might have also heard of ancient people using modems for connecting their computers to the internet (btw back then it was considered right to start the word “internet” with capital “I”). So Windows 95 contained an API and a minimalistic interface for that – you had to enter the ISP modem pool phone number, the username, the password, click “Connect” and wait a bit.

And then... If there was someone talking over the same line (typical if you shared the place you lived in), or if there was no dial tone (typical on rather old phone lines), or if the ISP number was busy (typical at peak hours), you would be shown an error message and the connection would fail. And if a connection was established and then closed for whatever reason, then again an error message would be displayed right in your face.

And the minimalistic interface would not make the slightest move to do something about it. A human would just re-dial, but the program showed the error message and stopped.

This could not last long. Numerous programs replacing that interface and reusing the built-in API emerged. They would (optionally) re-dial if the line was busy, they would (optionally) re-dial if the connection was lost, they would try several numbers in turn (the original UI would only have a place for one number). The world was saved.

Why is this long story here? Just because PVS-Studio is not an example of software designed like the brain-dead interface described right above. It uses the computer's power for searching and pinpointing crap in code, not for making people feel miserable with endless modal error messages. The technology is finally applied right. That's the most important achievement of the authors.

TL;DR; Do I License It Yet?

Now we get to the business applicability part. The truth is PVS-Studio will not find a gazillion bugs, and that might even disappoint you. Don't fall into despair too fast.

Also don't fall prey to the multiplication trick. People will say something like this.

“Suppose one bug discovered by static analysis would cost you X money if it got to the customers; then it takes N such bugs over the duration of the license for it to pay off, and after that it even makes you more efficient.”

Number X is usually quite high and number N is usually quite low, so licensing a static analysis tool looks like a no-brainer.

Hold on.

Brief Trial is a Failed Trial

Try the tool thoroughly (something like analyzing a million lines of code over the course of a month, at a very slow pace – a small portion like 50K lines of code at a time). Evaluate each warning to find out whether it reports a bug, and then estimate what it takes to trigger that bug – the number of bugs that are ever actually triggered is very low, something like no more than five per million lines of code unless your code is really crappy. Most of the bugs reported by the program will not manifest themselves in code that has been shipping for as long as three years to hundreds of thousands of users and has been used to process millions of input datasets.
Meanwhile the same codebase will contain a much larger number of other problems that automatic analysis will hardly diagnose even ten years from now, but that are triggerable and do cause real problems to real users.

So do you have to license a tool that
1. costs a fortune
2. doesn't find every occurrence of the situations it claims to find
3. reports a lot of warnings that need further analysis
4. requires lots of discipline and very developed technical culture?

You have to decide this for yourself. That's what the trial is for.

There's no silver bullet. It's not like you license the program and now magically your code is free of bugs. No. The tool will sometimes spot some problematic code and someone in your team will have to deal with that – maybe fix the code, maybe suppress the warning, maybe write a bug report to the analyzer authors.

Brushing Teeth Twice a Day Sums up to a lot of Time

The inverse of the multiplication trick is the continuous integration trick. To make the most of the tool you have to use it at all times, not just twice a year.

The program even contains an “incremental analysis” option that runs analysis of modified files after each compilation so that defects are reported as early as possible. This is just great – defects no longer slip even into the daily build. It also means they are not counted – how would one file a defect report for a piece of code that was fixed before it was even committed to version control? This improvement is great by itself, but it's very hard to measure how effective it is, and so it also prevents making a fact-based decision on whether the tool is worth the money.

Of course, continuous integration also means that you have to address all the new warnings reported by the tool every build, not twice a year. That's the other side of continuous.

The deeper the analysis goes, the easier it is to dismiss such an expensive tool. Yes, the tool finds crap really well. There is just not enough data to reliably back the claim that the tool finds crap at a very low cost.

The best way to think of static code analysis is to compare it to version control. Version control won't work immediately after you install it – you have to teach people how to use it, when and what to commit, how to describe commits, and how to properly tag, branch and merge. Once all that is set up, your workflow improves, but estimating how much more efficient you became is not that easy.

Granted there're high quality version control systems under free licenses – not so for static code analysis.

The Outcome

So once again.

The program is just great and will likely become even better as development progresses.

The same program is not oxygen and neither is it water. You decide whether it is actually useful for your business workflow and whether it is worth the time and money.