Implementation of the SnowballC stemmer. Note that punctuation and capital letters are also removed.
pr_stem_words(df, col, language = "french")
Argument | Description |
---|---|
df | the data.frame containing the sentences |
col | the column with the sentences |
language | the language of the words. Default is "french". See the SnowballC::getStemLanguages() function for a list of supported languages. |
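
To see which values the `language` argument may take, the supported languages can be listed directly from SnowballC. This is a minimal sketch that assumes only that the SnowballC package is installed; the variable name `langs` is illustrative.

```r
# List the stemming languages known to SnowballC
langs <- SnowballC::getStemLanguages()
print(langs)

# Check that the default used by pr_stem_words() is available
"french" %in% langs
```
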
a tibble
a <- data.frame(words = c("matin", "heure", "fatigué", "sonné", "lois", "tests", "fusionner"))
pr_stem_words(a, words)
#> # A tibble: 7 x 1
#>   words
#> * <chr>
#> 1 matin
#> 2 heur
#> 3 fatigu
#> 4 son
#> 5 lois
#> 6 test
#> 7 fusion
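
The description says that punctuation and capital letters are removed in addition to stemming. A rough sketch of that behaviour on a plain character vector might look like the following. It is not the package's actual implementation: `stem_sketch` is a hypothetical helper, and it assumes that "removing capitals" means lower-casing. Only `SnowballC::wordStem()` and base R functions are used.

```r
# Hypothetical helper, not part of the documented package.
stem_sketch <- function(words, language = "french") {
  cleaned <- tolower(words)                    # assumption: capitals are lower-cased
  cleaned <- gsub("[[:punct:]]", "", cleaned)  # strip punctuation characters
  SnowballC::wordStem(cleaned, language = language)
}

res <- stem_sketch(c("Tests,", "Fusionner!"))
print(res)
```

Unlike `pr_stem_words()`, this sketch works on a vector rather than on a column of a data.frame, so it is only meant to illustrate the cleaning and stemming steps.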