
Data Analysis with Open Source Tools

As a longtime data wrangler serving many masters, as one must in this role, I have been looking for a book that talks about the real-life challenges of the job. I would love some practical advice on how to do my job better without driving myself completely crazy.
I found at least some of that in Philipp K. Janert’s book Data Analysis with Open Source Tools. I am not the right audience for the math in the book and, based on my experience, translating something that technical for executive management would be extremely challenging, if not impossible. Often there are no serious math nerds on the team who understand the concerns of the business well enough to bring their numerical and computational skills to bear on them effectively (e.g., three action items to improve customer engagement by 15% in the next 90 days).
More often than not, it falls on the rest of us, who straddle the technical and business worlds, to divine (or help divine) something of value from the many cesspools of enterprise data. To be successful, we need to know how to make the most of what little we have in terms of clean data, repeatable processes, momentum for improving them and a common understanding of data across the enterprise.
In the preface and introduction of his book, Janert advocates using as little statistics as possible, going with the most commonsense way to analyze the data set and getting a feel for it just by looking at it. Slice and dice it many ways, run some charts and numbers to see if there is an interesting story buried there somewhere. This has been my approach almost 90% of the time and I was excited to see it endorsed by the author. I have used what the math yielded as a way to prove or disprove my story. While far from perfect, the method has helped point clients in the right direction and remedy issues that would otherwise have gone undiscovered.
Later in the book, the author brings up a very important point. Getting data to be good enough is often feasible, but getting it to be truly high quality may be an impossible task. If the success of a project hinges completely on the data being better than good enough, it may be wiser not to take on the project at all. This is excellent advice that I will remember to pass on to clients who are bent on cleaning the Augean stables in their quest for business intelligence nirvana.
I would definitely refer to this book again if my job ever required me to do the math on data instead of analyzing it using the far less rigorous techniques that most shops are content to use. However, I will continue to look for a cookbook for the analyst who has to work within the constraints of time, poor data quality and a lack of cohesive processes around the sources of data. Ideally, that book would have case studies, problem scenarios and real-life solutions that folks like me can relate to and apply in our own jobs.

