The Magnificence Of Data Science

Blog Writer | Content Creator: Tummalagunta Sai Sri Harshitha

Let's go, folks!

What is Data Science?

Data science is the art of using tools and techniques such as machine learning, deep learning and reinforcement learning to draw meaningful insights from data, solve problems and ultimately make decisions.

What is the importance of Data Science?

Earlier, most data lived in Excel or other spreadsheets and came in a structured format. Nowadays a huge amount of data is generated every second, and much of it is not neatly structured. Handling data at this scale is quite challenging for any organization, and this is where data science plays a crucial role.

What is the difference between data science and AI?

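Data science solves problems by drawing insights from data, whereas AI simplifies problems by letting machines carry out tasks that would otherwise need human intelligence.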

What is Data?

Simply put, data is a collection of information, such as facts or statistics.

Why is data stored?

  • For retrieval of the data for future purposes.

  • To audit the data quality. For instance, is it valuable data? Is it accurate data?

  • For quick recovery of files after a system crash.

Let's go step by step and understand the role of a data scientist.

When a client comes up with a problem statement, a data scientist works through the following stages:

1. Understanding the Business Problem

  • In the meeting with the client, the data scientist should listen carefully to the problem statement, ask suitable questions about the client's requirements, and then define the objective that needs to be addressed.

  • To become a good data scientist, you need to be very curious and keep asking "WH" questions such as what and why.

2. Data Acquisition

  • As data scientists, we gather data from multiple sources such as databases and logs.
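
Here is a minimal sketch of gathering data from two sources. Everything in it is an assumption made up for illustration: an in-memory SQLite database stands in for a real database, and the table, the names and the "customer_id,event" log format are hypothetical.

```python
import sqlite3

# Source 1: a database (an in-memory SQLite database stands in for a real one).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
customers = {row[0]: row[1] for row in conn.execute("SELECT id, name FROM customers")}
conn.close()

# Source 2: application log lines (hypothetical format: "customer_id,event").
log_lines = ["1,login", "2,login", "1,purchase"]

# Combine both sources into one list of records.
records = []
for line in log_lines:
    customer_id, event = line.split(",")
    records.append({"name": customers[int(customer_id)], "event": event})

print(records)
```

In a real project the sources would be production databases, log files or APIs, but the idea stays the same: bring everything into one place before preparing it.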

3. Data Preparation

  • Data cleaning stage:

    Data cleaning is a time-consuming process because it involves many complex scenarios. One common example is replacing missing values with specified values.

  • Data transformation stage:

    We modify the data according to specific mapping rules.

Hey! Hang on! Are you curious to know what mapping rules are?

  • Mapping rules describe how the fields in the source database correspond to the fields in the target database.
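
To make the two stages above concrete, here is a small sketch under stated assumptions. The rows, the field names (cust_nm, age) and the target names are hypothetical, and filling missing ages with the mean is just one possible "specified value".

```python
# Raw records from a hypothetical source system; None marks a missing value.
source_rows = [
    {"cust_nm": "Asha", "age": 31},
    {"cust_nm": "Ravi", "age": None},
    {"cust_nm": "Meena", "age": 45},
]

# Data cleaning: replace missing ages with the mean of the known ages.
known_ages = [row["age"] for row in source_rows if row["age"] is not None]
mean_age = sum(known_ages) / len(known_ages)
cleaned = [
    {**row, "age": row["age"] if row["age"] is not None else mean_age}
    for row in source_rows
]

# Data transformation: each mapping rule pairs a source field with a target field.
mapping_rules = {"cust_nm": "customer_name", "age": "age_years"}
target_rows = [
    {mapping_rules[field]: value for field, value in row.items()}
    for row in cleaned
]

print(target_rows[1])
```

Here the mapping rules are a plain dictionary from source field to target field. Real mapping documents often also describe type conversions and default values.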

4. Exploratory Data Analysis (EDA for short)

  • As the name suggests, EDA is about exploring the data. Understanding the data is very important, and EDA helps us find correlations between the variables in a given data set.

  • If we skip EDA, problems in the data are likely to go unnoticed and will affect the model we develop later.
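
As a rough illustration of finding correlations during EDA, the sketch below computes the Pearson correlation coefficient by hand. The three columns (hours_studied, exam_score, shoe_size) are made-up numbers used only for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data set with three columns.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 70, 72]
shoe_size = [7, 9, 6, 8, 7]

print("hours vs score:", round(pearson(hours_studied, exam_score), 2))
print("hours vs shoe size:", round(pearson(hours_studied, shoe_size), 2))
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests a weak one. In this toy data, hours studied is strongly related to the exam score and barely related to shoe size.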

5. Data Modeling

  • This is the main part of a data science project.

  • Data scientists train different algorithms (the model training stage) and find the one that best fits the business specifications.

  • To illustrate, if our model reaches an accuracy of 95% with K-Nearest Neighbours and 98% with a Decision Tree on the same test data, then we choose the Decision Tree as the best algorithm for this problem.

  • Cool, folks! No worries, we are going to learn about these algorithms in my future chronicles (writings).
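
The idea of trying several algorithms and keeping the best one can be sketched in plain Python. This is only a toy under stated assumptions: the data is made up, a 1-nearest-neighbour rule stands in for K-Nearest Neighbours, and a decision stump (a one-level decision tree) stands in for a full Decision Tree.

```python
def knn_predict(train, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def fit_stump(train):
    """Decision stump: pick the threshold t for which the rule 'x > t'
    classifies the most training points correctly."""
    xs = sorted(x for x, _ in train)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def train_hits(t):
        return sum((x > t) == bool(label) for x, label in train)

    return max(candidates, key=train_hits)

def accuracy(predict, data):
    """Fraction of (x, label) pairs that the predict function gets right."""
    return sum(predict(x) == label for x, label in data) / len(data)

# Hypothetical (feature, label) pairs; the point (3.0, 1) is deliberate noise.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (5.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test = [(1.5, 0), (2.8, 0), (3.2, 0), (4.5, 0), (6.5, 1), (7.5, 1)]

threshold = fit_stump(train)
models = {
    "1-nearest neighbour": lambda x: knn_predict(train, x),
    "decision stump": lambda x: int(x > threshold),
}

# Score every candidate on the same held-out test data and keep the best one.
scores = {name: accuracy(model, test) for name, model in models.items()}
best = max(scores, key=scores.get)
print(scores, "-> best:", best)
```

The important design choice is that both models are scored on the same test data, which neither of them saw during training. Otherwise the comparison would not be fair.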

6. Data Visualization

  • All the business findings are communicated to the client by visualizing the data with tools like Tableau, Power BI, RapidMiner and QlikView, which help create effective graphs, charts and maps in the form of dashboards and reports.

  • This helps the client understand the data more easily, and it is eye-catching too.

7. Model Evaluation and Testing

  • Using evaluation metrics such as the accuracy mentioned in the modeling stage, we measure how well the model performs. Here the performance of the fully trained model is evaluated on a testing data set, that is, data the model did not see during training.
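
Accuracy is not the only evaluation metric. The sketch below also computes precision and recall for a binary classifier. The y_true and y_pred lists are hypothetical labels and predictions on a small test set.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical true labels and model predictions on a test set.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Precision tells us how many of the predicted positives were really positive, and recall tells us how many of the real positives the model found.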

8. Productionization or Model Deployment stage

  • In this stage, we put our model into the production environment. This is called deploying the model. In simple words, it means putting the model into action so that it is readily available to the end user.
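
One very simplified way to picture deployment: the training side saves the model to a file, and the production side loads it and answers requests. In this sketch the "model" is just a hypothetical threshold stored as JSON, and the file name, feature name and request format are assumptions for illustration.

```python
import json
import os
import tempfile

# Training side: the "model" here is just a learned threshold (hypothetical).
trained_model = {"feature": "age_years", "threshold": 40.0}

model_path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(model_path, "w") as f:
    json.dump(trained_model, f)

# Production side: load the saved model once, then answer requests with it.
with open(model_path) as f:
    deployed_model = json.load(f)

def predict(request):
    """Return 1 when the request's feature value exceeds the stored threshold."""
    value = request[deployed_model["feature"]]
    return int(value > deployed_model["threshold"])

print(predict({"age_years": 52}))
```

Real deployments usually wrap a function like predict in a web service or a batch job, but the save, load and serve pattern is the same.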

9. Model Monitoring or Model Performance Tracking

  • It helps us check the performance of the model in production. If there are any errors, we can debug them and make the model perform much better.
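
A minimal sketch of performance tracking could look like this. The AccuracyMonitor class, the window size and the alert threshold are all hypothetical choices for illustration, not a standard tool.

```python
from collections import deque

class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag drops."""

    def __init__(self, window=5, alert_below=0.8):
        self.results = deque(maxlen=window)  # keeps only the latest outcomes
        self.alert_below = alert_below

    def record(self, prediction, actual):
        self.results.append(prediction == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_attention(self):
        acc = self.accuracy()
        return acc is not None and acc < self.alert_below

# Hypothetical (prediction, actual) pairs observed in production.
monitor = AccuracyMonitor(window=5, alert_below=0.8)
for prediction, actual in [(1, 1), (0, 0), (1, 1), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)

print(monitor.accuracy(), monitor.needs_attention())
```

When needs_attention() returns True, that is the signal to investigate: the data may have changed, or the model may need debugging or retraining.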

Conclusion

We just went through the very basic journey of a data scientist. Let's dive deep into more topics in my future blogs.

Blog Writer | Content creator of phenomenalharshis blog on hashnode.com : Tummalagunta Sai Sri Harshitha

END NOTE: Be curious to learn something new every day!

Plant a tree and plant a dream !!!!