I’m going to say something controversial, but I’m pretty sure it is true.

Pursuing a development career or hobby makes people lose their minds.  The kinds of thought processes so prized in the community are detrimental not only to the world at large, but often to the developer himself(or herself.)

Forcing people to classify everything in the world, and often to classify it as good or bad, is NOT a good way to look at the world.  Seeing everything in processes is a very simplified way to look at our subtle and nuanced existence, and despite your opinion of your brain’s power, you cannot possibly “model” the complexity of most things in your mind and believe it to be fully accurate.

Once developers get into this mindset, they start doing insane things like throwing out their furniture because they only sit on it 10 percent of the time, or eat out of a frying pan so they’ll have to wash one less dish.  Eventually they will throw out their dishes because they only use them a certain percentage of the time.  The world is not about efficiency, its about experience.  Don’t let your passion for efficiency and procedures keep you from enjoying the beauty this world has to offer.  Stay human, please…even if it means you’re not going to be the perfect dev.

Bookmark and Share

I’m planning to give an “Intro to Microsoft SQL Server” class for a friend who is interested in learning SQL server for work.  I’d be covering basic administration via SSMS and SQL, commonly used SQL like INSERT/UPDATE/DELETE with JOINS, and maybe some basic performance tuning/evaluation/troubleshooting via DMVs, execution plans, logs, etc.  It might take about 2-3 hours, and would only be useful for those with little to no experience.  Would anyone be interested in coming along?  It would be nice to show a few people while he’s learning, and would provide more people for asking questions, talking about confusing parts of my lesson, etc.

If you’re interested, or know someone who is, just email me or have them contact me directly, or leave a comment here.  We haven’t set a date, yet, but it will probably be next weekend.

Bookmark and Share

I’ve got nothing better to do, I am going to be spending the next few months working on completing some certifications.  I figure I’ve got about 60-70% of the knowledge I need via real world experience, and I’ll really only have to cover things like hardware requirements, Service Broker, endpoints, XML, etc.

I’ve mapped out a quick plan to achieve my goals of MCITPs for Database Administration and Business Intelligence in both SQL Server 2005 and 2008.

MCTS – SQL Server 2005 – Exam 70-431

The first step on my path, and my introduction to many of the technologies that I’m weak with.  This one will take me a few weeks of study and practice.

MCITP – Database Administrator – Exam 70-443 (database design)

This one should be pretty easy.  Designing DBs is what I do, and I don’t think I’ll have much of an issue here.

MCITP – Database Administrator – Exam 70-444 (optimizing and maintaining)

 Basic database administration here, simple stuff.

MCITP – Database Administrator 2008 – Exam 70-453

 The upgrade to 2008, I’ll likely spend more time on this one, since it will cover the new technologies in SQL 2008 that I haven’t had much exposure to in the real world.  I’ll be forced to learn things like geospatial and temporal data handling, filestreams, and other bleeding edge technology.  I expect this to be one of the more enjoyable studies, but it may take me 2 months to get a hold of the material.

MCTS – Business Intelligence  – Exam 70-445

 This one should be difficult, as I am quite well versed in reporting services, but have almost no experience with SSIS and Microsoft’s ETL.

MCITP – Business Intelligence – Exam 70-446

 More of the same here, lots of new technology to learn with respect to Data Warehousing and Data Mining.  I don’t even know how much MDX I’ll have to learn to pass this one.  

MCITP – Business Intelligence Developer – Exam 70-455

 Since I’ve been quite interested in the capabilities of SQL 2008’s data warehousing, this one should be a really fun study.

MCAS Excel 2007 – Exam 77-602

The cherry on top of my certification stack, this will give me the basic Excel chops to extend my BI work on the desktop.

Once these are complete, we’ll have a look at the MySQL CMA exam, and possibly the Database Developer MCITP track.  Someday, maybe even the $20,000 3 week Master Certification for SQL Server 2008.  Has anyone looked deeply into this certification?  I’d be very, very interested in talking with you!

Bookmark and Share

I’m competing in Rails Rumble 2008 this year with my PBG peeps and Nate Ostgard. Its a 48 hour competition to write a web application using Ruby on Rails! Check out our live blog here, and the final product once it is done!  To get an idea of what we’re working on, check out some preliminary text from our about page (first draft hehe)

 

iheartgames is a new way to schedule console games with your friends and family. If you’re tired of playing with strangers, or not having enough people online to get a great multiplayer experience, then iheartgames has you covered.

 

iheartgames lets you add your list of games on any console from our complete and updated database of games from metacritic.com. It also manages a fully featured friends list with tons of contact methods, so you know your friends will always know when you want to play.

 

You, or any of your friends, can schedule game times for any game in your list. Notifications (twitter, SMS, etc.) will go out automatically to any of your friends owning the game. They will then be able to reserve their spot in your game until all available spots are filled. With iheartgames, you are guaranteed a full house with people that you trust across all platforms. Pickup games are a thing of the past with iheartgames.

Bookmark and Share

If you’ve got your dates and datetimes stored as UTC in your database, but want to show the local time value on your reports in SQL Reporting Services, there is a simple solution for this!

 

Change the expression for the filed to the code below :

=System.TimeZone.CurrentTimeZone.ToLocalTime(Fields!DateTime.Value)

 

This converts your UTC datetime field to the local time zone value.

Bookmark and Share

Please check out my new RSS feed at http://feedproxy.google.com/DbaWill if you like RSS!

This will help me to see who is reading my blog, if anyone is at all.  I would appreciate it if you could update your RSS if you’re using my feed.

If anyone needs help setting up RSS, let me know in the comments.

Bookmark and Share

In the address bar of the new Google Chrome, you can type about:<item> much like Firefox.  

Does anyone know of a complete list of these about commands for chrome?  A quick google search turns up a few Mozilla lists, and some of them carry over, like about:cache or about:memory, but about:config doesn’t work.

I can’t find a list specific for Chrome anywhere, someone help!

Here are some basics to get you started

 

about: — show version info

about:version — same as about:

about:network — network tools

about:stats — a message, shh! this page is secret!

about:internets — a message, the tubes are clogged!

about:cache — show the content of your cache

about:plugins — show all information about currently installed plugins

about:memory — show memory usage from chrome and others web browsers running!

about:crash — crash the tab! and show the friendly crash image

about:dns — show dns info, like time

about:histograms — histograms resume

 

about:% will crash the browser!!!

Bookmark and Share

At the recent PASS meeting for the Las Vegas Chapter, I gave a quick talk on filtered indexes for SQL Server 2008.  I posted a powerpoint presentation over at slideshare.net, check it out!

Bookmark and Share
I want to find out if I’ve got an empty set or not.  In MS SQL I’d just use IF EXISTS and write a branch for either outcome, but I don’t have that luxury in MySQL.  So, if I want to find out if a set is empty, which one of the following will be faster?

 

SELECT * FROM `table` WHERE (`table`.app_id = 455) LIMIT 1

 

vs

 

SELECT count(*) AS count_all FROM `table` WHERE (`table`.app_id = 455)

 

The answer here is the LIMIT is probably going to be faster, and the key lies in the WHERE clause. 
If you run a count in MySQL on an indexed column, it is smart enough to use the cardinality of that column’s index to report to you the count.  This makes the count method quite inexpensive when we truly want a count, without qualifications.  As soon as we add that WHERE, we need to return every row that matches our criteria.  The trick with LIMIT, though it is doing the same thing, is that it stops after it gets a match.  Once that first row comes back, it’s done.  On an indexed column, this shouldn’t be much of a difference at all, probably imperceptible to the user and certainly close enough even for a timing test…but on an unindexed column, stopping the table scan in the middle, or if you’re lucky enough, the beginning, would be an enormous time savings.

 

I may revisit this with some real numbers, but a friend of mine brought the question to my attention and I thought it was a really great one.  I’m hoping to get some feedback. :)

Bookmark and Share

Full text indexing is an amazing feature. Instead of incredibly costly LIKE comparisons, an index can be created by the RDBMS which contains the full text of the column, or table, in question. For things like e-mail message bodies, biographies or other lengthy descriptions, or any other unstructured text, full text indexing is the only solution that makes sense. Using the LIKE method (for example : SELECT * FROM MyTable WHERE col1 LIKE ‘%MySQL’;) is so amazingly slow that it will bring a server to its knees if used often.

You enable full text indexing on MS SQL using the the SSMS gui(right click on a table and use the Full Text Index menu if full text indexing has been enabled on that database), or through T-SQL. For more detailed instructions, check out this great article on developer.com. Once the full text index is created, MS SQL just seems a whole lot faster when searching the indexed fields provided you’re using the right T-SQL. I’ll assume that my MS SQL guru readers are already pretty familiar with that!

Creating a FTI in MySQL requires a little DDL, seen below :

mysql> CREATE TABLE articles (
-> id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
-> title VARCHAR(200),
-> body TEXT,
-> FULLTEXT (title,body)
-> );

Once you’ve created and populated your MyISAM table (try using INSERT INTO ISAM_table_name (x) SELECT x from innoDB_table_name for a quick migration from another table with text values.) you can do some really cool things with the index, just like in MS SQL.

On the documentation page from MySQL.com you can find instructions for ranking result sets by relevance using the MATCH() function, which compares nicely with the CONTAINSTABLE keyword in MSSQL which returns both the key and the rank of each key in result or table form. You can also do simple boolean searches, just like the CONTAINS and FREETEXT keywords using IN BOOLEAN MODE. Weighted results are supported using the syntax below :

(MATCH (title) AGAINST (‘keywords’ IN BOOLEAN MODE)*10)+

(MATCH (title) AGAINST (‘keywords’ IN BOOLEAN MODE)*50);

Also, much like MS SQL, MySQL has some words that it doesn’t index…

  • Any word that is too short is ignored. The default minimum length of words that are found by full-text searches is four characters.
  • Words in the stopword list are ignored. A stopword is a word such as “the” or “some” that is so common that it is considered to have zero semantic value. There is a built-in stopword list, but it can be overwritten by a user-defined list.

Some methods I haven’t found in MySQL are :

  • Inflectional search – For example, searching for swim but finding swimming, swam, etc.
  • Proximity based search – Tennis within a few words of raquet gaining a much higher ranking than either of the keywords by themselves. Apparently, MySQL doesn’t store the position of the words in the index, so the proximity searching may not be in the next few versions, if it ever comes at all!

If any MySQL readers out there know how to do these in MySQL (obviously almost anything can be done in PHP, etc.)please drop me a line. You’ll receive full credit here and maybe even a nice paypal thank you!

So as we can see, MySQL does full text indexing just like MS SQL for the most part, but there is a catch. A BIG catch. You can only full text index a MyISAM table. In MySQL, there are multiple storage engines!! This means that many of the databases out there are going to have a tough time with full text indexing if they plan to scale out and stay with innoDB, unless we can find a solution. Quoted from Wikipedia, the following are some of the major differences between MyISAM and innoDB :

  1. InnoDB recovers from a crash or other unexpected shutdown by replaying its logs. MyISAM must fully scan and repair or rebuild any indexes or possibly tables which had been updated but not fully flushed to disk. Since the InnoDB approach is approximately fixed time while the MyISAM time grows with the size of the data files, InnoDB offers greater perceived availability and reliability as database sizes grow.
  2. MyISAM relies on the operating system for caching reads and writes to the data rows while InnoDB does this within the engine itself, combining the row caches with the index caches. Dirty (changed) database pages are not immediately sent to the operating system to be written by InnoDB, which can make it substantially faster than MyISAM in some situations.
  3. InnoDB stores data rows physically in primary key order while MyISAM typically stores them mostly in the order in which they are added. This corresponds to the MS SQL Server feature of “Clustered Indexes” and the Oracle feature known as “index organized tables.” When the primary key is selected to match the needs of common queries this can give a substantial performance benefit. For example, customer bank records might be grouped by customer in InnoDB but by transaction date with MyISAM, so InnoDB would likely require fewer disk seeks and less RAM to retrieve and cache a customer account history. On the other hand, inserting data in orders that differ substantially from primary key (PK) order will presumably require that InnoDB do a lot of reordering of data in order to get it into PK order. This places InnoDB at a slight disadvantage in that it does not permit insertion order based table structuring.
  4. InnoDB currently does not provide the compression and terse row formats provided by MyISAM, so both the disk and cache RAM required may be larger. A lower overhead format is available for MySQL 5.0, reducing overhead by about 20% and use of page compression is planned for a future version.
  5. When operating in fully ACID-compliant modes, InnoDB must do a flush to disk at least once per transaction, though it will combine flushes for inserts from multiple connections. For typical hard drives or arrays, this will impose a limit of about 200 update transactions per second. If you require higher transaction rates, disk controllers with write caching and battery backup will be required in order to maintain transactional integrity. InnoDB also offers several modes which reduce this effect, naturally leading to a loss of transactional integrity. MyISAM has none of this overhead, but only because it does not support transactions.

Now that we’ve read through the details, lets get to the point! InnoDB, MyISAM, and other engines allow a flexibility in decision making for administrators and developers. MS SQL has only one storage engine. This brings us back to the legos vs. furniture philosophy seen in the MySQL backups simplified post from last week. Unix is flexible, and Microsoft has a vision. You’ve got to work a little harder to make decisions in the Unix world, but it pays off when your tools match your problems. I encourage everyone interested in the MySQL storage engines and other database internals to check out Understanding MySQL Internals (Understanding)

For those of you not yet familiar with the SQL Server storage engine, check out Delaney’s amazing work on Inside Microsoft (r) SQL Server (tm) 2005: The Storage Engine

My suggestion for this particular case, which requires searching based on a column of RSS text, is that we have both MyISAM and innoDB tables in the same database. This won’t affect our backups, since we use mysqldump and, for the time being, won’t affect our performance too negatively. Since we’ll only be storing the text data in the RSS text table, we don’t need to worry too much about foreign key issues and other things that apparently affect MyISAM. A simple search and JOIN will solve all of our full text needs.

You can create a table using any MySQL compatible engine using the following syntax :

CREATE TABLE t (i INT) ENGINE = MYISAM;

or

CREATE TABLE t (i INT) ENGINE = INNODB;

This is a really cool feature, and allows you to mix and match technologies for storage inside a single database, which is something quite different from MS SQL. Also, you could get really crazy and create an entirely new database for your ISAM needs and use fully qualified data element names like SELECT * FROM databasename.tablename. Most people don’t recommend this.

More details to come with the final implementation, and look for an update to the backup post soon, as we document the FTP process for offsite backups!

Bookmark and Share