Checks each corpus element for the presence of a query and repeats the process for every query. The result is a table of queries and the number of matches for each corpus row.

Usage

run_lecat_analysis(
  lexicon,
  corpus,
  searches,
  id = NaN,
  regex_expression = "\\bquery\\b",
  inShiny = FALSE,
  case_sensitive = FALSE
)

Arguments

lexicon

Lexicon data frame as parsed by the parse_lexicon function.

corpus

Corpus data frame containing the search columns named in the searches data frame.

searches

Data frame with the columns 'Type' and 'Column'. Queries of each Type are searched for in the corresponding corpus Column.
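As an illustration, a searches data frame and a matching corpus might be built as in the sketch below. Only the 'Type' and 'Column' headers come from the description above; the Type values, the corpus column names, and the sample text are hypothetical.

```r
# Hypothetical searches table: each lexicon Type is paired with the
# corpus column in which its queries should be looked up.
searches <- data.frame(
  Type   = c("Technology", "Policy"),
  Column = c("title", "description"),
  stringsAsFactors = FALSE
)

# Toy corpus (made-up data) containing every column named in searches$Column.
corpus <- data.frame(
  video_id    = c("a1", "b2"),
  title       = c("New phone review", "Budget debate"),
  description = c("A phone with a better camera", "Tax policy explained"),
  stringsAsFactors = FALSE
)

# Sanity check before running an analysis: every search column
# must exist in the corpus.
columns_present <- all(searches$Column %in% names(corpus))
columns_present
```

A check like this can catch a misspelled column name before any searching starts.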

id

Name of the column used to identify individual corpus samples (e.g., YouTube video id). Autogenerated if no id is provided.

regex_expression

Regular expression template defining the search. The string 'query' in the template is replaced by the actual query term. The default, "\\bquery\\b", wraps the term in word boundaries so that it matches only as a whole word.
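The substitution described above can be sketched in base R. This is not the package's internal implementation: count_matches is a hypothetical helper, and the sample texts are made up. It assumes the query term contains no regex metacharacters, since none are escaped here.

```r
# Template in which the literal string 'query' stands for the term.
regex_expression <- "\\bquery\\b"

# Hypothetical helper (not part of the package): build the pattern for
# one term and count its matches in each text.
count_matches <- function(term, texts, template = regex_expression,
                          case_sensitive = FALSE) {
  # Replace the placeholder 'query' with the actual term.
  pattern <- gsub("query", term, template, fixed = TRUE)
  hits <- gregexpr(pattern, texts, ignore.case = !case_sensitive, perl = TRUE)
  # gregexpr returns -1 when there is no match, so count positive starts.
  vapply(hits, function(h) sum(h > 0), integer(1))
}

texts <- c("The phone rang; phones everywhere.", "No match here.", "Phone, phone!")

count_matches("phone", texts)                         # 1 0 2
count_matches("phone", texts, case_sensitive = TRUE)  # 1 0 1
```

In the first text "phones" is not counted, because the word boundary after the term fails when another letter follows it.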

inShiny

If inShiny is TRUE, Shiny-based notifications are shown.

case_sensitive

If case_sensitive is TRUE, the search is case sensitive.

Value

run_lecat_analysis returns a data frame containing the lexicon, the corresponding search column for each query type, and the frequency of terms by corpus id.
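To give a rough idea of what a table of query frequencies by corpus id looks like, the base R sketch below counts two queries in a toy corpus. It is an assumption-laden illustration, not the function's actual output: the column names (Query, a1, b2), the helper count_query, and the data are all hypothetical, and the real return value also carries the lexicon and search column information.

```r
# Toy corpus (made-up data) with an id column and one text column.
corpus <- data.frame(
  id   = c("a1", "b2"),
  text = c("Phone review: the phone is fast", "Camera and phone compared"),
  stringsAsFactors = FALSE
)
queries <- c("phone", "camera")

# Hypothetical helper: count whole-word, case-insensitive matches of
# one query in each text.
count_query <- function(query, texts) {
  hits <- gregexpr(paste0("\\b", query, "\\b"), texts,
                   ignore.case = TRUE, perl = TRUE)
  vapply(hits, function(h) sum(h > 0), integer(1))
}

# sapply gives one column per query; transpose so that each row is a
# query and each column is a corpus id.
counts <- t(sapply(queries, count_query, texts = corpus$text))
colnames(counts) <- corpus$id

result <- data.frame(Query = queries, counts, stringsAsFactors = FALSE)
result
```

Here "phone" occurs twice in sample a1 and once in b2, while "camera" occurs only in b2.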