Jul 11, 2016

Big Data Insights: Tale of IT Investments and Returns

Once again, this post presents a predictive analytical insight drawn from large volumes of information technology security data belonging to two Fortune 500 companies with broadly similar characteristics. As quick background, the analytical interest was to learn how each organization understood and invested in its IT security over time, and what return on investment (ROI) it saw.

Following my earlier Big Data Insight post, I received many queries about the data, so here I am publishing the data used for plotting, for quick play in R. As mentioned above, the volumes were huge, and all initial volumes were processed on the Apache Spark stack in a cloud environment. As usual, the analysis below was carried out with R components, viz. R 3.3.1, RStudio (my favorite IDE), and the ggplot2 package for plotting.

Now, let's understand the plot below. The x-axis shows the year, ranging from 1999 to 2015, and the y-axis shows the numbers of major threats and IT security employees observed at the two organizations (Org A and Org B). Starting at the year 2000, it is evident that Org A had more major threats than Org B (36 versus 26), while both organizations had around 10 IT security employees (13 at Org A and 7 at Org B; a year earlier, in 1999, Org B actually had one more employee than Org A). Over the next 2-3 years, however, Org A increased its IT security headcount to about 20, whereas Org B kept its headcount more or less unchanged for the next 10 years. As a result, Org B reached a stage where its number of major threats exploded and went beyond the existing team's control, whereas Org A's initial investment in employees worked out better: its number of major threats stayed more or less stable or decreased over time (don't forget that achieving zero is impossible, given the new technologies and applications arriving every year).

Data employed for the plot (the output of dput(IT_threats_returns), assigned back so that it can be pasted straight into R):
IT_threats_returns <- structure(list(Year = c(1999, 1999, 1999, 1999, 2000, 2000, 2000, 
2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2003, 2003, 
2003, 2003, 2004, 2004, 2004, 2004, 2005, 2005, 2005, 2005, 2006, 
2006, 2006, 2006, 2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008, 
2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010, 2011, 2011, 2011, 
2011, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2014, 2014, 
2014, 2014, 2015, 2015, 2015, 2015), Numeric_Value = c(28, 11, 
9, 10, 36, 26, 13, 7, 28, 26, 17, 9, 26, 29, 21, 10, 32, 21, 
19, 9, 25, 34, 19, 10, 30, 35, 20, 10, 22, 27, 19, 10, 31, 42, 
19, 11, 29, 47, 19, 11, 28, 45, 22, 11, 25, 55, 23, 13, 30, 51, 
21, 14, 25, 49, 22, 13, 32, 60, 22, 19, 25, 53, 25, 24, 19, 49, 
25, 29), Desc = c("Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps", 
"Org_A _ No_of_Major_Threats", "Org_B _ No_of_Major_Threats", 
"Org_A _ No_of_IT_Security_Emps", "Org_B _ No_of_IT_Security_Emps"
)), .Names = c("Year", "Numeric_Value", "Desc"), row.names = c(NA, 
68L), class = "data.frame")

# code used for plotting
library(ggplot2)
# One long-dashed line per series; the legend is hidden because the lines are labelled directly
p <- ggplot(IT_threats_returns, aes(x=Year, y=Numeric_Value, col=Desc)) +
  geom_line(linetype=5, size=1) +
  theme_light() +
  theme(legend.position="none") +
  ylab("") + xlab("")
# Label each line inside the plot, reusing the default ggplot2 colours of the four series
p + annotate("text", x=c(2012, 2012, 2004.5, 2012.5), y=c(47,34,18,10.5),
             label=c("   `Org_B` : No_of_Major_Threats", "   `Org_A` : No_of_Major_Threats",
                     "   `Org_A` : No_of_IT_Security_Emps", "   `Org_B` : No_of_IT_Security_Emps"),
             col=c("#C77CFF", "#7CAE00", "#F8766D", "#00BFC4"))
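
One way to put a number on the comparison of the two organizations is to compute major threats per IT security employee for each year. The snippet below is only a minimal sketch and was not part of the original analysis: the values are retyped from Numeric_Value in the dput output, and the names values, wide, and threats_per_emp are illustrative assumptions.

```r
# Illustrative sketch (not part of the original analysis): major threats per
# IT security employee, per organization and year.

# The same 68 values as Numeric_Value in the dput output, four per year in the
# order: Org A threats, Org B threats, Org A employees, Org B employees.
values <- c(28, 11,  9, 10,  36, 26, 13,  7,  28, 26, 17,  9,  26, 29, 21, 10,
            32, 21, 19,  9,  25, 34, 19, 10,  30, 35, 20, 10,  22, 27, 19, 10,
            31, 42, 19, 11,  29, 47, 19, 11,  28, 45, 22, 11,  25, 55, 23, 13,
            30, 51, 21, 14,  25, 49, 22, 13,  32, 60, 22, 19,  25, 53, 25, 24,
            19, 49, 25, 29)

# One row per year, one column per series (byrow = TRUE fills year by year).
wide <- as.data.frame(matrix(values, ncol = 4, byrow = TRUE))
names(wide) <- c("A_threats", "B_threats", "A_emps", "B_emps")
wide$Year <- 1999:2015

# Hypothetical derived measure: threats per employee, rounded to 2 decimals.
threats_per_emp <- data.frame(
  Year  = wide$Year,
  Org_A = round(wide$A_threats / wide$A_emps, 2),
  Org_B = round(wide$B_threats / wide$B_emps, 2)
)
print(threats_per_emp)
```

In 2010, for instance, this gives roughly 4.2 major threats per employee at Org B against about 1.1 at Org A. The ratio is a crude measure and says nothing about the severity of individual threats.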

Jul 1, 2016

Indian_IT_Cos_HR_Analytics_Hurdle

From time to time I am asked (because of my earlier experience developing an HR platform for a few big Fortune clients), “Why are Indian IT companies not moving towards advanced HR analytics?”


The following holds for more than 90% of Indian IT companies: 'the animal representing management cannot bypass an important layer, the big animal representing employees, which is a pseudo big (*); hence the jump needed to implement all those insights brought out for, or meant for, employees is almost impossible'. Here, one might guess the missing component, which most employees feel is of not much use in the context of Indian IT companies …………….?