
Data Analysis with Open Source Tools

As a longtime data wrangler serving many masters, as one must in this role, I have been looking for a book that talks about the real-life challenges of the job. I would love some practical advice on how to do my job better without driving myself completely crazy.
I found at least some of that in Philipp K. Janert’s book Data Analysis with Open Source Tools. I am not the right audience for the math in the book and, based on my experience, translating something that technical for executive management would be extremely challenging if not impossible. Often there are no serious math nerds on the team who understand the concerns of the business well enough to bring their numerical and computational skills to bear on them effectively (e.g. three action items to improve customer engagement by 15% in the next 90 days).
More often than not, it falls to the rest of us who straddle the technical and business worlds to divine (or help divine) something of value from the many cesspools of enterprise data. To be successful, we need to know how to make the most of what little we have in terms of clean data, repeatable processes and a common understanding of data across the enterprise, often in the face of inertia toward improving any of them.
In the preface and introduction of his book, Janert advocates using as little statistics as possible, going with the most commonsense way to analyze the data set, and getting a feel for it just by looking at it. Slice and dice it many ways, and run some charts and numbers to see if there is an interesting story buried there somewhere. This has been my approach almost 90% of the time and I was excited to see it endorsed by the author. I have used what the math yielded as a way to prove or disprove my story. While far from perfect, the method has helped point clients in the right direction and remedy issues that would otherwise have gone undiscovered.
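For readers who like to see the idea in code, here is a minimal sketch of what "slice and dice" can look like using nothing but the Python standard library. It is my own illustration, not an example from Janert's book: the `orders` records, the column names and the `summarize` helper are all hypothetical, made up only to show the pattern of grouping one column and eyeballing a few simple numbers for another.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records, invented purely for illustration.
orders = [
    {"region": "East", "channel": "web", "amount": 120.0},
    {"region": "East", "channel": "store", "amount": 80.0},
    {"region": "West", "channel": "web", "amount": 200.0},
    {"region": "West", "channel": "web", "amount": 40.0},
    {"region": "West", "channel": "store", "amount": 60.0},
]


def summarize(rows, key, value="amount"):
    """Group rows by one column and report count, mean and median of another."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        name: {"count": len(vals), "mean": mean(vals), "median": median(vals)}
        for name, vals in groups.items()
    }


# Cut the same data along each dimension and look at the numbers side by side.
for column in ("region", "channel"):
    print(column, summarize(orders, column))
```

In this toy data both regions have the same mean order amount, but the medians differ, which is the kind of small discrepancy that might hint at a story worth chasing before reaching for heavier statistics.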
Later in the book, the author brings up a very important point. Getting data to be good enough is often feasible, but getting it to be truly high quality may be an impossible task. If the success of a project hinges completely on the data being better than good enough, it may be wiser not to take on the project at all. This is excellent advice that I will remember to pass on to clients who are bent on cleaning the Augean stables in their quest for business intelligence nirvana.
I would definitely refer to this book again if my job ever required me to do the math on data instead of analyzing it using the far less rigorous techniques that most shops are content to use. However, I will continue to look for a cookbook for the analyst who has to work within the constraints of time, poor data quality and a lack of cohesive processes behind the sources of data. Ideally, such a book would have case studies, problem scenarios and real-life solutions that folks like me can relate to and apply in our own jobs.
