
You're reading from The Statistics and Machine Learning with R Workshop

Product type: Book
Published in: Oct 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781803240305
Edition: 1st Edition
Author (1)
Liu Peng

Peng Liu is an Assistant Professor of Quantitative Finance (Practice) at Singapore Management University and an adjunct researcher at the National University of Singapore. He holds a Ph.D. in statistics from the National University of Singapore and has ten years of working experience as a data scientist across the banking, technology, and hospitality industries.

Summary

In this chapter, we covered essential functions and techniques for data transformation, aggregation, and merging. For data transformation, which reshapes the rows and columns of a dataset, we learned about common utility functions such as filter(), mutate(), select(), arrange(), top_n(), and transmute(). For data aggregation, which condenses the raw dataset into a smaller summary view, we introduced functions such as count(), group_by(), and summarize(). For data merging, which combines multiple datasets into one, we learned about different joining methods, including inner_join(), left_join(), right_join(), and full_join(). Although other, more advanced joining functions exist, the essential ones covered here are enough to accomplish the same tasks. Finally, we went through a case study based on the Stack Overflow dataset. The skills we learned in this chapter will be useful in many data analysis tasks.
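As a quick recap, the three groups of functions named above can be sketched in a few lines of dplyr. This is a minimal illustration rather than the chapter's own code: the `questions` and `tags` tables, their columns, and their values are hypothetical toy data invented here, not the Stack Overflow dataset used in the case study.

```r
library(dplyr)

# Hypothetical toy data, invented purely for illustration
questions <- tibble(
  id    = c(1, 2, 3, 4),
  tag   = c("r", "r", "python", "sql"),
  score = c(10, 3, 7, 1)
)
tags <- tibble(
  tag      = c("r", "python", "julia"),
  category = c("language", "language", "language")
)

# Transformation: filter rows, derive a column, pick columns, sort
high_scores <- questions %>%
  filter(score >= 3) %>%
  mutate(score_doubled = score * 2) %>%
  select(id, tag, score_doubled) %>%
  arrange(desc(score_doubled))

# Aggregation: one summary row per tag
tag_summary <- questions %>%
  group_by(tag) %>%
  summarize(n_questions = n(), mean_score = mean(score))

# count() is a shortcut for the group-and-tally part of the summary
tag_counts <- count(questions, tag)

# Merging: the four joins differ only in which unmatched rows they keep
inner <- inner_join(questions, tags, by = "tag")  # matched rows only
left  <- left_join(questions, tags, by = "tag")   # all rows of questions
right <- right_join(questions, tags, by = "tag")  # all rows of tags
full  <- full_join(questions, tags, by = "tag")   # all rows of both

print(high_scores)
print(tag_summary)
```

With this toy data, the "sql" question has no match in `tags` and "julia" has no match in `questions`, so comparing the row counts of `inner`, `left`, `right`, and `full` shows how each join treats unmatched keys.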

In the next chapter, we will cover a more advanced topic...
