I recently joined the scalability team at trivago, and my first job was to dig into the actual MySQL schema to find out which parts could be improved in terms of performance and space consumption.
trivago's database consists of about 230 tables, many of them with a history of more than four years, some of up to seven years. A few of them have historic flaws, such as a signed bigint as primary key (because nobody could imagine how much space would be needed a few years later); others have indexes that are no longer used, and so on. There is a lot to work through before you know the database schema really well.
To support my work I started writing a little tool meant to perform some basic tasks on the schema. The first task was to find column definitions that are much bigger than the maximum value the column actually holds.
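To make the idea of an oversized column definition concrete, here is a minimal sketch in Python. It is not taken from the tool itself; the function names are made up for illustration, and it assumes the column only holds non-negative values (as an auto-increment key would). The byte sizes and value ranges are MySQL's documented integer types.

```python
# Storage size in bytes of MySQL's integer types, smallest first.
INT_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def max_value(type_name: str, unsigned: bool) -> int:
    """Largest value a MySQL integer type can store."""
    bits = dict(INT_TYPES)[type_name.upper()] * 8
    # Signed types give up one bit for the sign.
    return 2 ** bits - 1 if unsigned else 2 ** (bits - 1) - 1


def smallest_fitting_type(observed_max: int, unsigned: bool = True) -> str:
    """Smallest integer type that still holds observed_max (hypothetical helper)."""
    for name, _size in INT_TYPES:
        if observed_max <= max_value(name, unsigned):
            return name
    raise ValueError("value does not fit into any MySQL integer type")


def bytes_saved_per_row(declared_type: str, observed_max: int,
                        unsigned: bool = True) -> int:
    """Bytes per row that switching to the smallest fitting type would save."""
    sizes = dict(INT_TYPES)
    best = smallest_fitting_type(observed_max, unsigned)
    return sizes[declared_type.upper()] - sizes[best]
```

In practice observed_max would come from a query like SELECT MAX(col) FROM tbl, and you would want to leave some headroom for growth before shrinking a column.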
What came out of this approach was a small, framework-ish collection of scripts that parses the table structures and gives programmatic access to them. The framework is built around a plugin structure so that it is easy to extend.
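Just to give a rough feeling for what a plugin structure on top of parsed table structures can look like, here is a small Python sketch. It is only an illustration and not the actual AnalyzeMySQL code: the Column and Table classes, the plugin decorator and the run_plugins function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Column:
    name: str
    type: str
    unsigned: bool = False


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


# Hypothetical registry: plugin name -> function that inspects one table
# and returns a list of human-readable findings.
PLUGINS: Dict[str, Callable[[Table], List[str]]] = {}


def plugin(name: str):
    """Decorator that registers a function as a plugin under the given name."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("signed-bigint")
def find_signed_bigints(table: Table) -> List[str]:
    """Example plugin: report BIGINT columns that are not unsigned."""
    return [
        f"{table.name}.{col.name} is a signed BIGINT"
        for col in table.columns
        if col.type.upper() == "BIGINT" and not col.unsigned
    ]


def run_plugins(tables: List[Table]) -> Dict[str, List[str]]:
    """Run every registered plugin against every table and collect findings."""
    report: Dict[str, List[str]] = {}
    for name, func in PLUGINS.items():
        findings: List[str] = []
        for table in tables:
            findings.extend(func(table))
        report[name] = findings
    return report
```

The point of such a design is that a new check is just one more decorated function; the code that parses the schema and walks the tables does not need to change.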
I'll try to write another post explaining how to create such a plugin as soon as possible. In the meantime I would like to invite you to have a look at the (as yet unfinished and unpolished) code and give me some feedback.
You can find the code on GitHub: https://github.com/xenji/AnalyzeMySQL