Srinath Tummala

Resume | LinkedIn | GitHub


Hey there! This is Srinath Tummala, a junior data scientist with over a year of hands-on experience turning data into actionable insights. 👨‍💻📊💡

Data Science Portfolio



Advanced Linear Regression Redirect to Repo

Surprise Housing, a US-based housing company, has decided to enter the Australian market. The company uses data analytics to buy houses below their actual value and sell them at a higher price. To that end, it has collected a data set from house sales in Australia, provided as a CSV file, and is now looking at prospective properties to buy. The task is to build a regression model with regularisation that predicts the actual value of a prospective property, so the company can decide whether to invest in it. The company wants to know which variables are significant in predicting the price of a house and how well those variables describe the price. The project also determines the optimal value of lambda for ridge and lasso regression.

Steps Involved:

  • Importing modules, Reading the data
  • Analyzing Numerical Features
  • Outlier Treatment
  • Correlation Analysis
  • Missing Value Treatment
  • Univariate and Bivariate Analysis
  • Data Visualization
  • Encoding Categorical Features
  • Splitting data into Train and Test data
  • Transformation of Target Variable
  • Feature Scaling
  • Primary Feature Selection using RFE
  • Ridge Regression
  • Lasso Regression
  • Comparing model coefficients
  • Model Evaluation
  • Choosing the final model and most significant features.
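
As a rough illustration of how the ridge and lasso steps differ when their coefficients are compared, here is a minimal pure-Python sketch based on coordinate descent. The toy data, the function names `soft_threshold` and `fit_penalised`, and the lambda values are assumptions made for this example and are not taken from the project repo. The objective scaling (0.5 × RSS plus the penalty) is also a choice made here, so the lambda values are not comparable with those of any particular library.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam, returning exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def fit_penalised(X, y, lam, penalty, sweeps=50):
    """Coordinate descent for 0.5 * RSS + penalty, without an intercept.

    penalty="ridge" adds 0.5 * lam * sum(b**2);
    penalty="lasso" adds lam * sum(abs(b)).
    Assumes y and every column of X are already centred.
    """
    p = len(X[0])
    beta = [0.0] * p
    col_ss = [sum(row[j] ** 2 for row in X) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for row, target in zip(X, y):
                others = sum(row[k] * beta[k] for k in range(p) if k != j)
                rho += row[j] * (target - others)
            if penalty == "ridge":
                beta[j] = rho / (col_ss[j] + lam)
            elif penalty == "lasso":
                beta[j] = soft_threshold(rho, lam) / col_ss[j]
            else:
                raise ValueError(f"unknown penalty: {penalty}")
    return beta


# Toy, centred data (made up): y depends strongly on x0 and only weakly on x1.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
noise = [0.1, 0.1, 0.1, 0.1, -0.1, -0.1, -0.1, -0.1]
X = [[a, b] for a, b in zip(x0, x1)]
y = [3 * a + 0.2 * b + e for a, b, e in zip(x0, x1, noise)]

ridge_beta = fit_penalised(X, y, lam=8.0, penalty="ridge")
lasso_beta = fit_penalised(X, y, lam=4.0, penalty="lasso")
print("ridge:", ridge_beta)  # both coefficients shrunk, neither exactly zero
print("lasso:", lasso_beta)  # the weak feature's coefficient is exactly 0.0
```

Ridge shrinks every coefficient but keeps all features, while lasso can set weak coefficients to zero, which is why it doubles as a feature selector. In practice lambda would be chosen by cross-validation over a grid of candidate values rather than fixed by hand as it is here.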

Linear Regression - Boom Bikes Dataset Redirect to Repo

This project is based on the rental bike data set from a company called Boom Bikes.

Workflow

  • Data Loading and Understanding
  • Data Visualization
  • Data Preparation - Split into Train and Test Sets and Rescale
  • Data Modelling
  • Residual Analysis, Checking Assumptions of Linear Regression
  • Prediction and Evaluation on the Test Dataset
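
The workflow above can be sketched end to end with a single feature. This is a simplified pure-Python illustration, not the code from the repo: the `temps` and `rentals` values are made up, the helper names are hypothetical, and the real data set has more than one predictor.

```python
def min_max_fit(values):
    """Return a scaler fitted on the given values (training data only)."""
    lo, hi = min(values), max(values)
    return lambda v: (v - lo) / (hi - lo)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination of preds against ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data: rentals rise with temperature, plus a small alternating wobble.
temps = list(range(20))
rentals = [10 + 5 * t + (1 if t % 2 == 0 else -1) for t in temps]

# Deterministic split for the example: every fourth row goes to the test set.
test_idx = [i for i in range(len(temps)) if i % 4 == 3]
train_idx = [i for i in range(len(temps)) if i % 4 != 3]

# Rescale with statistics from the training rows only, then apply to both sets.
scale = min_max_fit([temps[i] for i in train_idx])
x_train = [scale(temps[i]) for i in train_idx]
y_train = [rentals[i] for i in train_idx]
x_test = [scale(temps[i]) for i in test_idx]
y_test = [rentals[i] for i in test_idx]

slope, intercept = fit_line(x_train, y_train)

# Residual analysis: with an intercept, training residuals should average to zero.
residuals = [y - (intercept + slope * x) for x, y in zip(x_train, y_train)]
mean_residual = sum(residuals) / len(residuals)

# Prediction and evaluation on the held-out rows.
test_preds = [intercept + slope * x for x in x_test]
test_r2 = r_squared(y_test, test_preds)
print(f"mean training residual: {mean_residual:.2e}, test R^2: {test_r2:.4f}")
```

Fitting the scaler on the training rows only keeps information from the test set out of the model, which is why a test value can land slightly outside the 0 to 1 range.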

EDA - Lending Club Case Study Redirect to Repo

This case study is based on the Lending Club data set. From the available data, we draw insights that help in categorizing a new customer. The objective is to analyze the data to identify customers who might default on the loan.

Table of Contents

  • Data Cleaning
  • Data Standardization
  • Missing value treatment
  • Outliers Check
  • Univariate Analysis and Segmented Univariate Analysis
  • Observations from Univariate and Segmented Univariate Analysis
  • Bivariate or Multivariate Analysis
  • Observations from Bivariate or Multivariate Analysis
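
Two of the steps above, the outliers check and the segmented univariate analysis, can be sketched with the standard library alone. The records, the field names (`grade`, `annual_inc`, `loan_status`), and the "Charged Off" label are assumptions chosen for illustration; the actual notebook may use different columns and tooling.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical loan records; annual income is in thousands.
loans = [
    {"grade": "A", "annual_inc": 30, "loan_status": "Fully Paid"},
    {"grade": "A", "annual_inc": 35, "loan_status": "Fully Paid"},
    {"grade": "A", "annual_inc": 40, "loan_status": "Charged Off"},
    {"grade": "A", "annual_inc": 45, "loan_status": "Fully Paid"},
    {"grade": "B", "annual_inc": 50, "loan_status": "Charged Off"},
    {"grade": "B", "annual_inc": 55, "loan_status": "Fully Paid"},
    {"grade": "B", "annual_inc": 60, "loan_status": "Charged Off"},
    {"grade": "B", "annual_inc": 1000, "loan_status": "Fully Paid"},
]


def iqr_outliers(values):
    """Return values outside the usual 1.5 * IQR fences."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def default_rate_by(records, key):
    """Share of charged-off loans within each segment of `key`."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [defaults, total]
    for record in records:
        counts[record[key]][1] += 1
        if record["loan_status"] == "Charged Off":
            counts[record[key]][0] += 1
    return {segment: d / total for segment, (d, total) in counts.items()}


outliers = iqr_outliers([loan["annual_inc"] for loan in loans])
rates = default_rate_by(loans, "grade")
print("income outliers:", outliers)
print("default rate by grade:", rates)
```

Comparing the default rate across segments of one variable at a time is the core of the segmented analysis: a segment whose rate is well above the overall rate is a candidate risk indicator.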

Malicious URL Detection Redirect to Repo

The aim of this project is to classify a URL as malicious or benign. We collected a public dataset of 450,176 rows covering two classes of URLs: malicious and benign. We extracted a total of 17 features, consisting of lexical features, count-based features, and two binary features. We trained AdaBoost and Random Forest classifiers and combined them with the Voting Classifier ensembling method to get the best result from the two classifiers.
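
To give a feel for the three kinds of features, here is a small standard-library sketch of a feature extractor. The eight features below and the name `extract_features` are illustrative assumptions; they are not the 17 features used in the thesis, which are not listed here.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_ip(hostname):
    """True when the hostname is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(hostname)
        return True
    except ValueError:
        return False


def extract_features(url):
    """Map a URL string to a dict of numeric features (hypothetical set)."""
    parsed = urlparse(url)
    hostname = parsed.hostname or ""
    return {
        # Lexical features: lengths of the URL and its parts.
        "url_length": len(url),
        "hostname_length": len(hostname),
        "path_length": len(parsed.path),
        # Count-based features: how often certain characters occur.
        "count_dots": url.count("."),
        "count_hyphens": url.count("-"),
        "count_digits": sum(ch.isdigit() for ch in url),
        # Binary features: 1 if the property holds, else 0.
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(host_is_ip(hostname)),
    }


features = extract_features("http://192.168.0.1/login.php?id=1")
print(features)
```

Each URL becomes one row of numbers like this, and rows of that form are what classifiers such as AdaBoost and Random Forest are trained on.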

This project was submitted as a thesis at GITAM University, Hyderabad.

Authors: Srinath Tummala, Shivani Donthi