BY SAMPURNA MAJUMDER · PUBLISHED JANUARY 5, 2017
These days, when we talk about hiring, recruitment and human resources, we often refer to the term ‘analytics’; to be more specific, we often come across the term HR analytics. Related to HR analytics are several other terms, such as data mining, predictive analytics and so on. Nonetheless, how many of us know exactly what these terms mean, and how many such terms exist in the domain of HR analytics?
In the following post, we will take a look at a glossary of HR analytics terms.
#1. HR analytics
HR analytics refers to the application of essential data mining and business analytics techniques to talent data. It usually refers to analytics that measure performance and efficiency in areas that matter specifically to HR.
#2. Predictive analytics
Predictive analytics is a branch of advanced analytics used to make projections about unknown future events. In recruitment, it is about predicting the most likely candidates for a vacant position.
Predictive analytics employs many techniques, including statistical modelling, data mining, artificial intelligence and machine learning, to scrutinise existing data and make predictions about future events. In recruitment, it allows organisations to become proactive, anticipating behaviours and outcomes based on actual data.
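As a sketch of the idea, the snippet below scores candidates with a simple logistic function. The feature names and weights are hypothetical; a real predictive model would learn them from historical hiring data.

```python
import math

# Hypothetical weights for illustration only; a real model would
# learn these from past hiring outcomes.
WEIGHTS = {"years_experience": 0.4, "skill_match": 2.0, "referral": 0.8}
BIAS = -3.0

def hire_probability(candidate):
    """Logistic score: estimated probability that a candidate is a good fit."""
    z = BIAS + sum(WEIGHTS[k] * candidate[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

strong = {"years_experience": 5, "skill_match": 0.9, "referral": 1}
weak = {"years_experience": 1, "skill_match": 0.2, "referral": 0}
```

Recruiters could then rank open applications by this score and focus on the most promising profiles first.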
#3. Data mining
Data mining is almost like digging for gold. Just as gold diggers sift through piles of dust and sand in the hope of striking a piece of shiny gold, data mining is the method of discovering patterns in piles of raw data and turning them into concrete information, which can later be used to make predictions about staffing.
#4. Machine learning
Machine learning is a form of Artificial Intelligence (AI) that gives computers the ability to learn without being explicitly programmed. It is mostly achieved through various pattern-recognition processes. With the help of machine learning, a system can start recognising patterns in candidates’ data, such as their work history and profile information.
#5. Descriptive analytics
Descriptive analytics mines historical performance data to look for the reasons behind past success or failure.
#6. Cost modelling
Cost modelling helps those in the C-suite to understand and interpret several HR-related expenses. These include recruitment and onboarding costs, the estimated time required for an employee to attain maximum productivity, compensation, employee turnover, and overall productivity costs. Cost modelling can also offer an insightful picture of retention and recruitment plans, even for a stipulated period.
#7. Decision tree
A decision tree is a tree-shaped model comprising decisions and their possible consequences. It is a significant tool for making predictions: a decision tree allows you to predict what might happen by learning from existing data.
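To make the idea concrete, here is a minimal hand-built decision tree in Python. The attributes (`engagement`, `tenure_years`) and thresholds are illustrative assumptions, not learned from data.

```python
# A tiny hand-built decision tree; thresholds are illustrative, not learned.
def flight_risk(employee):
    if employee["engagement"] < 0.4:          # first split
        if employee["tenure_years"] < 2:      # second split
            return "high"
        return "medium"
    return "low"
```

Each branch is a decision on an attribute, and each leaf is a predicted outcome; a learned tree chooses the splits automatically from historical data.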
#8. R
Many HR practitioners use Excel. However, most predictive HR analysts use R, one of the most popular tools among data scientists. R is a free, open-source system for statistical computation and visualization. It also enables you to work with massive data sets that would be too large to handle in Excel.
#9. Structured data vs. unstructured data
There are two types of data in the HR analytics domain: structured and unstructured. When data is neatly organised into a spreadsheet or database, it is called structured data.
On the other hand, where the data is not properly structured, it is referred to as unstructured data. Its lack of structure makes it time-consuming and tiring to use.
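A small illustration of the difference, on made-up records: structured data can be aggregated directly, while unstructured text first needs its fields extracted (here with a regular expression).

```python
import re

# Structured: named fields, ready for analysis.
structured = [
    {"name": "Asha", "role": "Analyst", "tenure_years": 3},
    {"name": "Ravi", "role": "Engineer", "tenure_years": 5},
]
avg_tenure = sum(r["tenure_years"] for r in structured) / len(structured)

# Unstructured: free text; fields must be extracted before analysis.
unstructured = "Asha has been an Analyst for 3 years; Ravi has been an Engineer for 5 years."
tenures = [int(m) for m in re.findall(r"for (\d+) years", unstructured)]
```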
#10. Multivariate analysis
Multivariate analysis is essentially the statistical procedure of simultaneously analysing multiple independent (or predictor) variables together with multiple dependent (outcome or criterion) variables using matrix algebra (most multivariate analyses are based on correlation).
In human resources, when you want to predict how age and engagement levels influence someone’s compensation and performance ratings, there are two dependent variables. This is what is known as multivariate analysis.
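As a sketch, the snippet below fits both dependent variables in one least-squares step using NumPy; the ages, engagement scores, salaries and ratings are invented for illustration.

```python
import numpy as np

# Invented data: predictors = [age, engagement], outcomes = [salary, rating]
X = np.array([[25, 0.5], [30, 0.7], [35, 0.6], [40, 0.9], [45, 0.8]])
X = np.column_stack([np.ones(len(X)), X])        # add an intercept column
Y = np.array([[40, 3.1], [50, 3.6], [55, 3.4], [65, 4.2], [70, 4.0]])

# One least-squares fit handles both dependent variables at once:
# B has one coefficient column per outcome.
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
```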
#11. Quantitative scissors
Quantitative scissors is a phrase widely used by data scientists to describe the moment when an employee begins to be profitable. Consider the example of two intersecting lines: one is a cost line, and the other is a benefit line. Once the benefit line rises above the cost line, the employee becomes an asset to the organisation rather than an expense.
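The intersection point can be computed directly. A toy sketch with hypothetical monthly figures:

```python
# Hypothetical figures: a flat monthly cost and a ramping monthly benefit
# as the new hire gets up to speed.
monthly_cost = 5000
monthly_benefits = [0, 2000, 5000, 8000, 10000, 12000, 12000]

def breakeven_month(cost, benefits):
    """First month in which cumulative benefit exceeds cumulative cost."""
    cum_cost = cum_benefit = 0
    for month, benefit in enumerate(benefits, start=1):
        cum_cost += cost
        cum_benefit += benefit
        if cum_benefit > cum_cost:
            return month
    return None

month = breakeven_month(monthly_cost, monthly_benefits)
```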
#12. Boosting
When one creates an algorithm, one wants it to be as accurate and as predictive as possible. Boosting is an iterative statistical technique, used while developing an algorithm, that creates multiple extra training data sets. A model is created for each of these data sets. The data sets are constructed deliberately so that the weight of the misclassified data points is increased; therefore, the next model will fit these misclassified points better. This process repeats itself several times. Together, these models decide on the most probable outcome. They make this choice based on a weighted vote in which more accurate models have more voting power than less accurate ones.
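The core re-weighting step can be sketched in a few lines. This is a simplified illustration of the idea, not a full boosting implementation such as AdaBoost.

```python
import math

def boost_weights(weights, predictions, labels, alpha=1.0):
    """Increase the weight of misclassified points so the next model
    focuses on them, then renormalise so weights sum to one."""
    new = [w * math.exp(alpha) if p != y else w
           for w, p, y in zip(weights, predictions, labels)]
    total = sum(new)
    return [w / total for w in new]

# Four points, equally weighted; the second one is misclassified.
w = [0.25] * 4
w = boost_weights(w, predictions=[1, 0, 1, 1], labels=[1, 1, 1, 1])
```

After the update, the misclassified point carries more weight, so a model trained on the re-weighted data pays more attention to it.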
#13. Random forest
Contrary to the boosting technique, the random forest technique randomises the algorithm instead of the data. Usually, a decision tree algorithm selects the best attribute to split its branches on. In a random forest, this procedure of selecting the best attribute is randomised, which leads to the production of different trees; hence, a forest. Together, these random trees produce a much better result.
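A toy sketch of the randomisation-plus-voting idea follows; the “trees” here are stubs built on random feature subsets rather than real learned trees.

```python
import random

rng = random.Random(0)
FEATURES = ["engagement", "skill_match", "tenure_norm"]

# Each "tree" is a stub that looks at a random subset of features;
# a toy stand-in for growing real trees on randomised attributes.
def make_tree():
    chosen = rng.sample(FEATURES, k=2)       # randomised attribute selection
    return lambda emp: sum(emp[f] for f in chosen) > 1.0

forest = [make_tree() for _ in range(5)]

def forest_predict(emp):
    votes = sum(tree(emp) for tree in forest)
    return votes > len(forest) / 2           # majority vote across trees
```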
#14. Pruning
The technique of pruning is associated with the concept of a decision tree. Pruning is used to reduce the complexity of a decision tree. A decision tree is built by taking the most informative attribute to split its branches on, and this process continues until the tree is complete; pruning then removes branches that add complexity without improving the predictions.
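One simple pruning rule, collapsing splits whose branches predict the same class, can be sketched as follows (encoding a tree as nested dicts is an illustrative choice):

```python
# A tree as nested dicts; a leaf is just a class label.
def prune(node):
    """Collapse a subtree into a leaf when both branches predict the same class."""
    if not isinstance(node, dict):
        return node
    node["left"] = prune(node["left"])
    node["right"] = prune(node["right"])
    if node["left"] == node["right"] and not isinstance(node["left"], dict):
        return node["left"]
    return node

tree = {
    "split": ("engagement", 0.4),
    # Redundant split: both branches predict the same class.
    "left": {"split": ("tenure", 2), "left": "leave", "right": "leave"},
    "right": "stay",
}
pruned = prune(tree)
```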
#15. Demand and supply forecasting
As the name suggests, demand forecasting means estimating the number of people, with the right skill set, required to accomplish certain upcoming tasks. Supply forecasting, on the other hand, means estimating the number of people who will be available to accomplish a certain future task.
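For illustration, the gap between the two forecasts can be computed directly; the role names and headcounts below are hypothetical.

```python
# Hypothetical headcount forecasts per role for the next quarter.
demand = {"data_analyst": 8, "recruiter": 3, "engineer": 12}
supply = {"data_analyst": 5, "recruiter": 4, "engineer": 12}

# Positive gap = hiring need; zero or negative = demand is covered.
gap = {role: demand[role] - supply.get(role, 0) for role in demand}
to_hire = {role: n for role, n in gap.items() if n > 0}
```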
#16. Cost to hire analytics
Needless to say, all external hires come at a cost. Sometimes this cost is hidden, for example the cost of sifting through hundreds of resumes to find the perfect candidate; sometimes it is as visible as the cost of an advertisement for a job vacancy. Cost to hire analysis makes use of historical hiring data to evaluate and identify the various cost heads, and then suggests ways to reduce or contain the largest of them.
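A minimal sketch of such an analysis, with invented cost heads and an invented number of hires:

```python
# Invented cost heads for one quarter, in currency units.
cost_heads = {
    "job_ads": 1200,
    "agency_fees": 4500,
    "screening_hours": 900,   # a "hidden" cost: recruiter time on resumes
    "onboarding": 1500,
}
hires = 4                     # hires made in the same quarter (hypothetical)

total = sum(cost_heads.values())
cost_per_hire = total / hires
largest_head = max(cost_heads, key=cost_heads.get)   # where to look for savings
```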
#17. Joining probability analysis
HR professionals often have to deal with situations in which a candidate accepts an offer but then backs out of joining the organization. Joining probability analysis involves building a probability score for every candidate as an indicator of how likely they are to join. It identifies the profiles of candidates who are more likely to join, based on an analysis of their characteristics and traits. A tool like this helps avoid the loss incurred when a candidate backs out after accepting an offer.
#18. Hiring channel mix modeling
HR professionals make use of several channels while hiring candidates. These include employee referrals, recruitment consultancies, social media and so on. Hiring channel mix modeling analyses previous hires, identifying all the channels that have led to hires for an organization and the interplay between them. It can also go a step further and pinpoint the channels that have been the most fruitful.
#19. Attrition analytics
In order to understand attrition analytics, we first need to understand flight risk. Flight risk refers to the risk that an employee is looking to switch jobs. HR must devise methods to retain talent for as long as possible. Attrition analytics helps reduce flight risk by adopting a predictive approach and identifying problem areas in advance.
#20. Clustering
Clustering refers to a type of machine learning that makes predictions by grouping data. Imagine, for example, 1,000 data points divided into three clusters. Machine learning makes it possible to estimate the different clusters, and when a new data point is introduced, the algorithm can predict which cluster it most likely belongs to.
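A tiny k-means implementation illustrates the idea: points are grouped around centroids, and a new point is assigned to the nearest cluster. The data here is synthetic.

```python
def nearest(p, centroids):
    """Index of the centroid closest to point p (squared distance)."""
    return min(range(len(centroids)),
               key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2)

def kmeans(points, centroids, iters=10):
    """Tiny k-means: assign points to the nearest centroid, then move each
    centroid to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Two synthetic groups of points, one near (0, 0) and one near (10, 10).
points = [(0, 0), (1, 1), (0, 1), (10, 10), (11, 10), (10, 11)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

A new data point is then classified by calling `nearest` with the fitted centroids.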
#21. Training data vs test data
When you have a data set, you can certainly choose to develop an algorithm on it. But how would you know whether the predictions you made were accurate? To find out, you need a separate set of data, known as the test data set: the algorithm learns from the training data and is then evaluated on data it has never seen.
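A minimal sketch of holding out a test set (the 75/25 split fraction is an arbitrary illustrative choice):

```python
import random

def train_test_split(data, test_frac=0.25, seed=42):
    """Shuffle a copy of the data, then hold out a test fraction."""
    data = list(data)
    random.Random(seed).shuffle(data)
    n_test = int(len(data) * test_frac)
    return data[n_test:], data[:n_test]      # train, test

train, test = train_test_split(range(100))
```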
#22. Overfitting
Overfitting refers to a modeling error that occurs when a function fits too closely to a limited set of data points. In HR analytics, it is mostly related to machine learning: machine learning is a complex technique that can provide very detailed analyses, and because so much detail is involved, it runs the risk of ‘overfitting’. An overfitted algorithm predicts the data it was trained on almost perfectly but generalises poorly to new data.
#23. Bagging
Bagging is another form of meta-algorithm; the name stands for bootstrap aggregation. Bagging refers to a technique in which multiple training sets are independently sampled, with replacement, from the original data set. Bagging helps reduce the effect of outliers on the algorithm, and thus the algorithm’s variance as well. This technique is mostly used with the decision tree model.
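The bootstrap-sampling step at the heart of bagging can be sketched as follows; in practice, a model would be fitted to each sample and their predictions aggregated.

```python
import random

rng = random.Random(7)

def bootstrap(data):
    """Sample with replacement, same size as the original data set."""
    return [rng.choice(data) for _ in data]

data = list(range(20))
samples = [bootstrap(data) for _ in range(5)]   # five independent training sets
```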
#24. C4.5
C4.5 is a decision tree algorithm as well as a well-known data mining algorithm. For every new branch, C4.5 uses the criterion of normalised information gain (the gain ratio) per attribute and then selects the best feature to split the branch on.
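The gain-ratio criterion can be sketched directly from its definition; the labels below are invented.

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    counts = {label: labels.count(label) for label in set(labels)}
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def gain_ratio(parent, partitions):
    """Information gain of a split, normalised by the split's own entropy
    (the criterion C4.5 uses to choose attributes)."""
    n = len(parent)
    gain = entropy(parent) - sum(len(p) / n * entropy(p) for p in partitions)
    split_info = -sum(len(p) / n * math.log2(len(p) / n) for p in partitions if p)
    return gain / split_info if split_info else 0.0

# A perfectly separating split on invented labels scores the maximum, 1.0.
parent = ["stay"] * 4 + ["leave"] * 4
ratio = gain_ratio(parent, [["stay"] * 4, ["leave"] * 4])
```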
#25. Linear regression
Linear regression analysis refers to a statistical method that estimates the relationship between a dependent variable and one or more independent variables. Regression analysis uses the least squares method to estimate the best-fitting line through the data. This line can then be used to predict various outcomes.
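For one independent variable, the least squares fit has a closed form. A sketch with hypothetical experience-versus-salary figures:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b        # intercept a, slope b

# Invented data: years of experience vs salary in thousands.
a, b = fit_line([1, 2, 3, 4, 5], [42, 48, 55, 61, 66])
predicted = a + b * 6            # extrapolate to 6 years of experience
```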
#26. Data cleaning
Data cleaning is a well-known topic within HR analytics. It is the process of going through the data, fixing inconsistencies, handling missing values and preparing the data for analysis. Since HR data is often regarded as ‘dirty’, data cleaning is carried out to ensure that the data is fit for use.
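A minimal sketch of typical cleaning steps (trimming whitespace, normalising categories, typing values, dropping duplicates) on invented records:

```python
# Invented "dirty" HR records: stray whitespace, inconsistent casing,
# a missing salary and a duplicate row.
raw = [
    {"name": " Asha ", "dept": "HR", "salary": "52000"},
    {"name": "Ravi", "dept": "hr", "salary": None},
    {"name": " Asha ", "dept": "HR", "salary": "52000"},
]

def clean(rows):
    seen, out = set(), []
    for r in rows:
        rec = {
            "name": r["name"].strip(),       # trim whitespace
            "dept": r["dept"].upper(),       # normalise casing
            "salary": int(r["salary"]) if r["salary"] is not None else None,
        }
        key = (rec["name"], rec["dept"])
        if key not in seen:                  # drop duplicate rows
            seen.add(key)
            out.append(rec)
    return out

cleaned = clean(raw)
```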
#27. Business intelligence
Business intelligence refers to making effective use of data and information to take well-informed, sound business decisions. It comprises various elements such as data analysis, data mining and reporting. Business intelligence in HR can help make the right, data-backed decisions, and let go of decisions that are not supported by the data.
#28. Workforce analytics
Workforce analytics is a combination of software and methodology that applies statistical models to worker-related data, allowing enterprise leaders to optimise human resource management. Workforce analytics can help leaders develop and improve recruiting methods, make general and specific hiring decisions and, of course, keep the best workers within the company.