Experimenting

There are times when you have to take at face value what you are told.

There are 1.31 billion people living in China. This is according to several sources (which all probably go back to the same official document from the Chinese government). I’m willing to believe that number. I’m certainly not going to go to China and start counting heads. For one, I don’t have the time; for another, I might look awfully weird doing so. It’s also accurate enough for any discussions I might have about China. But if I were going to knit caps for every person in China, I might want a more accurate number.

That said, sometimes one shouldn’t take facts at face value. A case in point is given below. Let me start out by saying the person who gave me this fact wasn’t wrong. At least, they’re no more wrong than the person who tells me that the acceleration due to gravity is 9.8 m/s². No, they are at worst inaccurate and more likely imprecise. Acceleration due to gravity here on Earth IS roughly 9.8 m/s². But it varies depending on where on the surface I am. And if I’m on the Moon, it’s a completely different value.

Sometimes it is in fact possible to actually test a claim, and it’s often worth doing. I work with SQL Server and this is very true here. If a DBA tells you with absolute certainty that a specific setting should be set, or a query must be written a specific way, or an index rebuilt automatically at certain times, ask why. The worst answer they can give is, “I read it some place.” (Please note, this is a bit different from saying, “Generally it’s best practice to do X”. Now we’re back to saying 9.8 m/s², which is good enough for most things, but may not be good enough if, say, you want to precisely calibrate a piece of laboratory equipment.)

The best answer is “because I tested it and found that it works best”.

So, last night I had the pleasure of listening to Thomas Grohser speak on the SQL IO engine at the local SQL Server User Group meeting. As always, it was a great talk. At one point he was talking about backups and various ways to optimize them. He made a comment about setting the maxtransfersize to 4MB being ideal. Now, I’m sure he’d be the first to add the caveat, “it depends”. He also mentioned how much compression can help.

But I was curious and wanted to test it. Fortunately I had access to a database that was approximately 15GB in size. This seemed like the perfect size with which to test things.

I started with:

backup database TESTDB to disk='Z:\backups\TESTDB_4MB.BAK' with maxtransfersize=4194304

This took approximately 470 seconds and had a transfer rate of 31.151 MB/sec.

backup database TESTDB to disk='Z:\backups\TESTDB_4MB_COMP.BAK' with maxtransfersize=4194304, compression

This took approximately 237 seconds and had a transfer rate of 61.681 MB/sec.

This is almost twice as fast.  While we’re chewing up a few more CPU cycles, we’re writing a lot less data.  So this makes a lot of sense. And of course now I can fit more backups on my disk. So compression is a nice win.

But what about the maxtransfersize?

backup database TESTDB to disk='Z:\backups\TESTDB.BAK'

This took approximately 515 seconds and had a transfer rate of 28.410 MB/sec. So far, it looks like changing the maxtransfersize does help a bit (roughly 9%) over the default.

backup database TESTDB to disk='Z:\backups\TESTDB_comp.BAK' with compression

This took approximately 184 seconds with a transfer rate of 79.651 MB/sec.  This is the fastest of the 4 tests and by a noticeable amount.

Why? I honestly don’t know. If I were really trying to optimize my backups, most likely I’d run each of these tests 5-10 more times and take an average. This may be an outlier. Or perhaps the 4MB test with compression ran slower than normal. Or there may be something about the disk setup in this particular case that makes it the fastest method.
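One way to do that averaging, as a minimal sketch in Python rather than anything that was run for this post. The lists hold only the single timings reported above; more runs would be appended as they are collected. The names summarize and percent_faster are made up for illustration.

```python
from statistics import mean, stdev

# One measured run per configuration, taken from the timings in this post.
# Further runs would be appended to each list as they are collected.
elapsed_seconds = {
    "default": [515],
    "default + compression": [184],
    "4MB": [470],
    "4MB + compression": [237],
}


def summarize(samples):
    """Return (average, sample standard deviation or None with fewer than two runs)."""
    spread = stdev(samples) if len(samples) > 1 else None
    return mean(samples), spread


def percent_faster(baseline, candidate):
    """Percentage reduction in elapsed time relative to the baseline."""
    return 100.0 * (baseline - candidate) / baseline


for name, samples in elapsed_seconds.items():
    average, spread = summarize(samples)
    note = "only one run so far" if spread is None else f"+/- {spread:.1f}s"
    print(f"{name}: {average:.0f}s ({note})")
```

With only one run per configuration there is no spread to report, which is exactly why a single surprising result shouldn’t be trusted yet.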

The point is, this is something that is easy to set up and test. The entire testing took me about 30 minutes and was done while I was watching TV last night.

So before you simply accept something you read on some blog someplace about “you should do X to SQL Server”, take the time to test it. Perhaps it’s a great solution in your case. Perhaps it’s not. Perhaps you can end up finding an even better solution.

Documentation

Do it, it’s important.

OK, I suppose I should expand a bit upon that and, in this case, add an actual example.

So last night, I again attended the local SQL Server User Group meeting. The talk this month was by Ray Kim and was on Documentation for Techies. While we all agree that documentation is good, it’s sort of interesting how rarely most techs actually do it. Ray’s talk covered some of this and further talked about exactly how valuable it is. In addition, several audience members spoke about how proper documentation saved their company a great deal of money simply by giving their tech support people the ability to answer questions far faster.

I got to thinking about some of the clients I’ve worked for and how I’ve wanted to document stuff, but often they have very little actually set up in the way of procedures to handle documentation. This is unfortunate, because it can cost them money. For example, for a client right now I’m working on automating a task. It turns out that there’s not much documentation, so I’m basically struggling to figure things out as I go.

One thing you hear tech folks talk about a lot is “oh, the code is self-documenting”. And sometimes it is. Since I work in SQL, often, but not always, it’s clear what the code is doing. For example:

Select firstname, lastname from Clients where ClientID=@ClientID

probably doesn’t need a comment saying what it does. It’s pretty clear. But a more complex query might need some commenting, or it may need some explanation as to why a particular approach was taken. For example, I was recently writing a stored procedure where the where clause was not quite what one would expect if one were to naively write it in the most obvious manner. However, the obvious manner would have resulted in a table scan of a very large table. By writing what I did, I could ensure a seek would occur.

I also had a habit which, after thinking about it last night and testing today, I’m going to modify a bit. Often I’d write procedures such as:

-- Usage: Exec FOO
-- Author: Greg D. Moore
-- Date: 2016-03-15
-- Version: 1.0
-- This simply returns bar when executed
if OBJECT_ID('foo', 'p') is not null drop procedure foo
go
create procedure foo
as
select 'bar'
go

Now, note that technically this is a script (T-SQL) that will drop and then create the procedure, so it’s more than just the procedure. But it’s useful for me because I can ensure I’m running the latest and greatest and drop the old one if it exists before running it.

But last night got me thinking. What happens if, three years down the road, someone comes along and needs to edit my code? Let’s say the client didn’t do a good job of keeping track of source code and they have to extract the scripts to create the procedures from SQL Server itself using, say, SSMS.

The results end up looking much more like this:

USE [Baz]
GO
/****** Object:  StoredProcedure [dbo].[foo]    Script Date: 03/15/2016 10:47:22 ******/
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[foo]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[foo]
GO
USE [Baz]
GO
/****** Object:  StoredProcedure [dbo].[foo]    Script Date: 03/15/2016 10:47:22 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
create procedure [dbo].[foo]
as
select 'bar'
GO

Ignore the extra USE statements and the SSMS generated comments and SET statements. Notice my comments are gone.  This actually makes sense because in the first script, the comments occur before a GO statement so the SQL engine interprets them as completely separate from the statements to create the actual stored proc.  All my useful comments are now history.

BUT, there’s a simple solution. Move the comments to after the first GO statement.

if OBJECT_ID('foo', 'p') is not null drop procedure foo

go

-- Usage: Exec FOO
-- Author: Greg D. Moore
-- Date: 2016-03-15
-- Version: 1.0
-- This simply returns bar when executed
-- Version: 1.1
-- Comments moved below GO statement

create procedure foo
as

select 'bar'
go

Now if I use SSMS to generate my script I get:

USE [Baz]
GO

/****** Object: StoredProcedure [dbo].[foo] Script Date: 03/15/2016 10:48:53 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[foo]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[foo]
GO

USE [Baz]
GO

/****** Object: StoredProcedure [dbo].[foo] Script Date: 03/15/2016 10:48:53 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

-- Usage: Exec FOO
-- Author: Greg D. Moore
-- Date: 2016-03-15
-- Version: 1.0
-- This simply returns bar when executed
-- Version: 1.1
-- Comments moved below GO statement

create procedure [dbo].[foo]
as

select 'bar'

GO

Now my great documentation is preserved. This is a small thing but down the road could save the next developer a lot of trouble.

So, stop and think about not only documentation, but how to make sure it’s preserved and useful in the future.

On Call

I want to pass on a video I’ve finally gotten around to watching:

Dave O’Conner speaks

I’ve managed a number of on-call teams to various levels of success. One point I’d add that makes a difference is good buy-in from above.

He addresses several good points, most of which I would fully agree with and have even adopted at various times at my jobs.

One thing he mentions is availability. Too often folks claim they need 99.999% uptime. My question has often been “Why?”, followed by, “Are you willing to pay for that?” Often the why boiled down to “umm.. because…” and the answer on paying for it was “no”, at least once they realized the true cost.

I also had a rule that I sometimes used: “If there is no possible response, or no response is necessary, don’t bother alerting!”

An example might be traffic flow.  I’ve seen setups where if the traffic exceeds a certain threshold once in say a one hour period (assume monitoring every 5 seconds) a page would go out.  Why? By the time you respond it’s gone and there’s nothing to do.

A far better response is to automate it such that if it happens more than X times in Y minutes, THEN send an alert.
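A minimal sketch of that rule in Python, assuming each threshold breach arrives with a timestamp in seconds. The class name ThresholdAlerter and its parameters are hypothetical, not part of any particular monitoring product.

```python
from collections import deque


class ThresholdAlerter:
    """Alert only when breaches exceed max_events within a sliding time window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events          # the "X" in "X times in Y minutes"
        self.window_seconds = window_seconds  # the "Y", expressed in seconds
        self._events = deque()                # timestamps of recent breaches

    def record(self, timestamp):
        """Record one breach; return True if an alert should go out."""
        self._events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Forget breaches that have aged out of the window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.max_events
```

With max_events=3 and window_seconds=600, a single spike never pages anyone; the fourth breach within ten minutes does.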

In some cases, simply retrying works.  In the SQL world I’ve seen re-index jobs fail due to locking or other issues.  I like my sleep.  So I set up most of my jobs to retry at least once on failure.

Then, later, I’ll review the logs. If I see a constant pattern of retries, I’ll schedule time to fix it.
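The retry-then-review idea can be sketched in a few lines of Python. This is an illustration only, not how SQL Server jobs are actually configured; the function name run_with_retry and its parameters are hypothetical.

```python
import time


def run_with_retry(job, retries=1, delay_seconds=0.0, sleep=time.sleep, log=None):
    """Run job(); on an exception, wait and try again up to `retries` more times.

    Every failure is appended to `log` so it can be reviewed later. The final
    failure is re-raised, which is the point where a page would go out.
    """
    if log is None:
        log = []
    for attempt in range(retries + 1):
        try:
            return job()
        except Exception as error:
            log.append((attempt + 1, repr(error)))
            if attempt == retries:
                raise
            sleep(delay_seconds)
```

Passing in a list for log keeps a record of the failed attempts, so a job that succeeds only on its retry still leaves a trail to review during the day.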

At one client, we had an issue where a job would randomly fail maybe once a month.  They would page someone about it, who would rerun the job and it would succeed.

I looked at the history and realized that simply adding a delay of about 5 minutes after a failure and then retrying would reduce the number of times someone had to be called from about once a month to once every 3 years or so. Fifteen minutes of reviewing the problem during a normal 9-5 timeframe and 5 minutes of checking the math and implementing the fix meant the on-call person could get more sleep every month. A real win.
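The math behind that kind of estimate can be sketched as follows. The post doesn’t say how often the job ran, so a once-a-day schedule is assumed here, and failures are assumed to be independent of each other; both are assumptions, as is the function name.

```python
def days_between_pages(failure_probability, retries, runs_per_day=1.0):
    """Expected days between pages when a page needs every attempt to fail.

    Assumes each attempt fails independently with the same probability.
    """
    page_probability = failure_probability ** (retries + 1)
    return 1.0 / (page_probability * runs_per_day)


# Assumed: a daily job that fails about once a month, i.e. roughly 1 run in 30.
p = 1 / 30
print(f"no retry:  one page every {days_between_pages(p, 0):.0f} days")
print(f"one retry: one page every {days_between_pages(p, 1) / 365:.1f} years")
```

Under those assumptions, no retry means a page roughly every 30 days, while one retry stretches that to roughly 900 days, or about two and a half years, which is in the same ballpark as the estimate above.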

Moral of the story: Not everything is critical, and if it is, handle it as if it is, not as an afterthought.

Never run out of a plan

I’ve actually been meaning to blog about this for a while, but have been putting it off, so here goes.

I’ve mentioned in the past my analogy of “flying the plane”. Lately I’ve been spending a lot of time on a site called Quora. It’s quite a fun site and I’ve learned quite a bit.

But this particular question I think is a great one for life in general.

Scrolling down, you’ll see a post from Jim Mantle. I want to take a quote from his answer:

There have been many air crashes where a problem was being worked by both pilots, neither was flying the aircraft, and they had a Very Bad Day.

If you read about the L1011 Crash you’ll see the real mistake was failing to actually fly the plane. The crew was so engrossed in solving the problem of a burnt-out landing gear light that they missed the fact that the plane was flying into the ground.  A simple burned out bulb and 101 people died.

Compare that to the Miracle on the Hudson where the pilots had a MUCH worse problem (lack of power in either engine) and managed to bring the plane down safely without any loss of life.

He also has good advice that he repeats often: “Keep calm”.

I also want to quote Dirk Van Der Walk who later says:

You can run out of height, you can run out of engine, but one thing you can never run out of, is a plan. You must always have a Plan B.

I had a client a few years ago that had called me in to implement a specific change in their infrastructure.  There was also a fairly specific timetable by which it had to be done.

I met with the CTO about once a month to go over the status of the project. At one point it became clear that, due to certain corporate policies, it would take about 12 weeks to get to a certain milestone in the project. Unfortunately, the schedule demanded we be there in about 8 weeks.

He asked me what we could do.  I explained I had no control over the corporate policies and that we should start to consider a Plan B.  I’m quite proud that I kept my jaw from hitting the floor when he uttered his next sentence.

There is no plan B and there can’t be a plan B.

This is an example of taking the mantra “Failure is not an option” to a whole new level.

Ironically I was there about a month later when the CTO was basically called out on the carpet for the status of the project and when it was clear he had no plan B, the corporate folks spent the next 24 hours designing a plan B.

In part this wasn’t too hard because the internal people on the project already had several plan Bs in mind.

It was only because others did have a plan B that we were able to save any real semblance of the original goal.

Moral of the story: always have a backup plan.  And start thinking about a backup plan to the backup plan.

Getting the right answer by suggesting the wrong one

I’m a participant on a CMC called Lily. It is based out of my alma mater, RPI. At some point, someone created a rule (which I’ve seen elsewhere, so it’s hardly unique) that sometimes the fastest way to get the right answer to a question is to post the wrong answer.

There is truth to that.  I think in part it can be summed up with this XKCD cartoon.  Many of us who are involved in technology seem to have an incessant need to be “right”.  So when we see something wrong, we’re compelled to correct the mistake.

But, to be wrong, it has to be clearly wrong.  To go back to my cave rescue experience, if I recommend a 3:1 haul system and you recommend a 2:1, neither of us is necessarily wrong. We might be optimizing for different factors.  On the other hand, if you recommend we use 11mm rope for the haul line and I whip out some clothesline I’ve had in my car for a few years and suggest it should be good enough, after all it’s only Bill we’re rescuing, I’m clearly going to be wrong and need to be corrected.

These thoughts about being wrong and trying to find the right answer were prompted by a coding problem that has consumed far too much of my time. I finally came up with an answer that worked, but not one that I liked.

Essentially I’m building a ComboBox (loading it from a DataTable) in VB.NET.

It has key,value pairs; let’s call them (“Test1”, “A”), (“Test2”, “B”) and so forth.
(Note: VB.NET appears to call these a DisplayMember, ValueMember pair, and they can be loaded from a dictionary type. In my mind, what they call the ValueMember is what I’d consider the lookup key, and that may illustrate my misunderstanding of the issue.)

However, once I load the record in question, I want the selected value in the dropdown to reflect the value in the record (which of course is stored as “A” or “B” etc.)

There appears to be no way in VB.Net to easily say something like:

cbxResource.SelectedValue = Itemrecord.Value

Then I tried:

cbxResource.SelectedItem = Itemrecord.Item

just to see if it would work. It doesn’t.

Googling suggests something like:

cbxResource.SelectedIndex = cbxResource.FindString(Itemrecord.Item)

That does indeed work, if I know the DisplayMember name. But that’s what I want to display, not what I store in Itemrecord, and as such I don’t know it.

It strangely seems I cannot set the index based on the ValueMember, just the DisplayMember. To me this is strange since, coming from a DB world, it appears the ValueMember would be the key I’d want to look up to select the DisplayMember to be displayed.

I finally settled on a hack.  What if I switched the two?

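' Temporarily swap the members so FindStringExact searches the stored value
cbxResource.DisplayMember = "Resource"
cbxResource.ValueMember = "Description"

cbxResource.SelectedIndex = cbxResource.FindStringExact(Itemrecord.Item)

' Swap back so the description is what gets displayed
cbxResource.DisplayMember = "Description"
cbxResource.ValueMember = "Resource"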

I’m not sure I like this answer. It seems to me it should be far simpler. Or that I’m fundamentally misunderstanding how the control should be set up and used. But for now it’s the hack that’s going into my code.

So why publish here?  Well either it’s a great work-around and I can save other folks the hours of fruitless searching I experienced, or someone can say, “It’s on the Internet and it’s wrong; I have to correct it!”

I’ll take either answer.

Moral: Sometimes being wrong is the right thing to do.

Newspapers and paradigm shifts

When I was fairly young, I learned a detail about newspaper advertising. The space on the lower-outside right-hand page was worth more than the lower left inside page (i.e. along the fold).

If you think about how folks read and flip through newspapers, this makes sense. It’s an area more likely to be seen than others.

With news, there are the terms “above the fold” and “below the fold.” Obviously, you want the big news article on the front page, above the fold where it’s most likely to be seen.

When laying out a newspaper, there is over a century of experience in how to do things. You don’t jump a front page news article to a page in the middle of the sports section; for the most part, you don’t run box-scores on the front page (unless perhaps it’s an upset at the Super Bowl or something else that will garner eyeballs); you don’t scatter sections of your newspaper across different pages.

Years ago, I was proud to be part of one of the first newspaper web application service providers, “PowerAdz” (which later became PowerOne Media, and then later most of it was bought by TownNews).

Even back then, I realized much of what was known about newspaper layout was going to have to change. There was no longer a physical fold in the newspaper. There was a bottom edge to a browser window, and that still meant you needed the important news at the top. But how long should it run down the “page”? How many pixels did the viewer have before the bottom edge of the window? What was the width of your front page?

You also weren’t limited by a physical size to a page.  Articles could run on as long as readers were willing to scroll.  Or was having a reasonable sized page with links to following pages better?

Much of this is still in flux. And I suspect will continue to be for years to come.  Heck, just the fact that articles can have hyperlinks to other articles, or background information makes news on web pages very different from the traditional print medium.

What reminded me of this today was seeing yet another comment on a CNN fluff piece that was linked off of the front page. The commenter was complaining that “this is news?”

Someone replied it was under the Entertainment section. Another rebutted, “yeah, but it’s on the front news page.”

That reminded me of these thoughts. What is the front page any more? Even though you can click to different sections of CNN, it’s not like a traditional newspaper where you have physically separate sections, each with its own front page. Now it’s all virtual and a front page is simply as you define it.

I think ultimately we have to let go of our definition of the front page of a news site and accept that links to news, fluff pieces and the like will all end up there.  Sure, there will be sections within the page, but to complain there’s sports, or entertainment, or other non-traditional news links off the front page will be like complaining you don’t have to unscroll the papyrus in the correct direction to read it: a sign of an older time.

Times change, but more importantly the medium changes, even if the message doesn’t.

QuiCR’s latest product

I mentioned in my latest post I was working on a new project, one that valued simplicity over complexity.  I can now talk about it a bit more.

I had approached a local cab company about the QuiCR product. Unfortunately, given who his largest market demographic was (most of his fares do not have cell phones, let alone smart phones), we decided it wasn’t a great fit.

But, as I mentioned, he had made an off-hand comment about something he would like: namely, a way to let a smaller demographic of his, the local college students, send a text to his dispatcher and request a cab.

This is one of those design ideas that’s both deceptively simple and complex at the same time.  It’s simple because “Receive text, display text, allow a response” about describes the problem.

Now, the simplest solution obviously would be to give the dispatcher a cell phone with texting capabilities. That would also be the wrong answer. For one thing, his dispatchers work at a frantic pace and time is of the essence. While some folks may be able to whip off text messages using “text-speak” in seconds, his staff isn’t among those with fingers that nimble.

It also doesn’t provide for easy reviewing of messages and threads and the like.

So, the trick was coming up with a computer interface that was simple enough that it could be adopted with only a few minutes of training and that wouldn’t interfere with their current manual dispatch system.

The keyword there is manual. Yes, there are systems out there with all sorts of bells and whistles that can integrate with GPS, credit card systems, IVR and much more. Those systems also cost a LOT of money. And in at least one case, a vendor was suggesting that to adopt it, he hire another dispatcher to handle the increased load. Note the load wasn’t necessarily from increased business, but simply from the complexity of the system. Now, don’t get me wrong, in a large city where you have dozens of cabs, such a system is the right approach and scales well. But it doesn’t scale very well to smaller companies.

His dispatchers use a very manual system.  And it works. Hopefully my new “text-dispatcher” will integrate well with the current system and generate some new business for him.

Sometimes, simpler is better, but harder.