You're reading from  MATLAB for Machine Learning - Second Edition

Product type: Book
Published in: Jan 2024
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781835087695
Edition: 2nd Edition
Author (1)
Giuseppe Ciaburro

Giuseppe Ciaburro holds a PhD and two master's degrees. He works at the Built Environment Control Laboratory - Università degli Studi della Campania "Luigi Vanvitelli". He has over 25 years of work experience in programming, first in the field of combustion and then in acoustics and noise control. His core programming knowledge is in MATLAB, Python and R. As an expert in AI applications to acoustics and noise control problems, Giuseppe has wide experience in researching and teaching. He has several publications to his credit: monographs, scientific journals, and thematic conferences. He was recently included in the world's top 2% scientists list by Stanford University (2022).

Exploring corpora and word and sentence tokenizers

The analysis of corpora, words, and sentence tokenization forms the basis for comprehensive language understanding. Corpora provide real-world language data for analysis, words constitute the elements of expression, and sentence tokenization structures the text into meaningful units for further investigation. This trio of concepts plays a central role in advancing linguistic research and enhancing NLP capabilities.
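To make the distinction between sentence and word tokenization concrete, here is a minimal sketch in Python using only the standard library. It is an illustration rather than the book's own MATLAB workflow: the function names `split_sentences` and `split_words` are hypothetical, the sample texts are invented, and the regular-expression rules are deliberately naive (they would mis-split abbreviations such as "Dr." and break hyphenated words apart).

```python
import re

def split_sentences(text):
    """Naive sentence tokenizer: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def split_words(sentence):
    """Naive word tokenizer: lowercase runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

# A tiny, invented corpus: each string stands in for one document.
corpus = [
    "Corpora provide real-world language data. Words are the elements of expression.",
    "Sentence tokenization structures text into units!",
]

# Nested structure: corpus -> documents -> sentences -> word tokens.
tokenized = [[split_words(s) for s in split_sentences(doc)] for doc in corpus]

for doc in tokenized:
    for sentence_tokens in doc:
        print(sentence_tokens)
```

The nested list mirrors the three levels described above: the corpus holds documents, each document is split into sentences, and each sentence is split into word tokens. Production tokenizers typically handle abbreviations, numbers, and punctuation with far more care than these two patterns do.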

Corpora

In linguistics and NLP, corpora are extensive collections of written or spoken texts that serve as sources of data for linguistic analysis and language-related studies. Corpora provide a diverse range of language samples, enabling researchers to examine patterns, trends, and variations in language usage, syntax, and semantics across different contexts and genres.
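One simple way to examine variation in usage across genres is to compare word frequencies between sub-corpora. The sketch below is a standard-library Python illustration under stated assumptions: `toy_corpus`, its genre labels, and its texts are invented for the example, and `word_counts` is a hypothetical helper, not a function from the book or from any toolbox.

```python
import re
from collections import Counter

# Hypothetical toy corpus: genre label -> text sample (invented for illustration).
toy_corpus = {
    "news": "The council approved the budget. The vote was close.",
    "chat": "lol the vote was wild, was it not",
}

def word_counts(text):
    """Count lowercase word tokens (runs of letters and apostrophes)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

# One frequency table per genre, so usage can be compared side by side.
counts_by_genre = {genre: word_counts(text) for genre, text in toy_corpus.items()}

for genre, counts in counts_by_genre.items():
    print(genre, counts.most_common(3))
```

With a real corpus the same per-genre tables could feed relative-frequency comparisons; on samples this small the counts only show the mechanics.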

Linguistic corpora represent sizable collections of spoken or written texts, often originating from authentic communication contexts...
