Searches for queries in a corpus using a specific regular expression

Each corpus element is checked for the presence of a query, and the check is repeated for every query in the lexicon. The result is a table of queries with the number of matches for each corpus row.

run_lecat_analysis(
  lexicon,
  corpus,
  searches,
  id = NaN,
  regex_expression = "\\bquery\\b",
  inShiny = FALSE,
  case_sensitive = FALSE
)

Arguments

lexicon	Lexicon data frame as returned by the parse_lexicon function
corpus	Corpus data frame containing the search columns named in the searches data frame
searches	Data frame with the columns 'Type' and 'Column'. Queries of each Type are searched for in the corresponding corpus Column
id	Column name used to identify individual corpus samples (e.g., a YouTube video id). An id is generated automatically if none is provided.
regex_expression	Regular expression template defining the search. The string 'query' in the template is replaced by the actual query term
inShiny	If TRUE, Shiny-based notifications are shown
case_sensitive	If TRUE, the search is case sensitive
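
The placeholder substitution described for regex_expression can be sketched in base R as follows. This is only an illustration of the idea, not the package's internal code: build_pattern is a hypothetical helper, and mapping case_sensitive = FALSE onto ignore.case = TRUE is an assumption.

```r
# Hypothetical helper (not part of the package): substitute the literal
# placeholder 'query' in the template with the actual query term.
build_pattern <- function(query, regex_expression = "\\bquery\\b") {
  gsub("query", query, regex_expression, fixed = TRUE)
}

pattern <- build_pattern("cat")

text <- "The cat sat. CAT videos concatenate."

# Count non-overlapping matches of a pattern in a single string.
count_hits <- function(pattern, text, ignore_case) {
  m <- gregexpr(pattern, text, ignore.case = ignore_case, perl = TRUE)[[1]]
  sum(m > 0)  # gregexpr returns -1 when there is no match
}

# Assumed analogue of case_sensitive = FALSE: matches "cat" and "CAT"
n_insensitive <- count_hits(pattern, text, ignore_case = TRUE)

# Assumed analogue of case_sensitive = TRUE: matches "cat" only
n_sensitive <- count_hits(pattern, text, ignore_case = FALSE)
```

With the default template, the word boundaries (\b) keep "concatenate" from counting as a match for "cat".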

Value

run_lecat_analysis returns a data frame containing the lexicon, the corresponding search column for each query type, and the frequency of terms by corpus id.
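
To make the idea of per-row match counts concrete, here is a minimal base R sketch that counts several queries in each row of a toy corpus. It does not call run_lecat_analysis and does not reproduce its output format: the corpus, the column names (id, title, query, n), and the count_matches helper are all assumptions made for illustration.

```r
# Toy corpus with hypothetical column names, for illustration only.
corpus <- data.frame(
  id = c("v1", "v2"),
  title = c("Cat and dog videos", "Dogs, dogs, dogs"),
  stringsAsFactors = FALSE
)
queries <- c("cat", "dog", "dogs")

# Hypothetical helper: number of case-insensitive whole-word matches
# of one query in each element of a character vector.
count_matches <- function(query, texts) {
  pattern <- gsub("query", query, "\\bquery\\b", fixed = TRUE)
  hits <- gregexpr(pattern, texts, ignore.case = TRUE, perl = TRUE)
  vapply(hits, function(m) sum(m > 0), integer(1))
}

# One row per query and corpus id, with the match count in column n.
counts <- do.call(rbind, lapply(queries, function(q) {
  data.frame(
    query = q,
    id = corpus$id,
    n = count_matches(q, corpus$title),
    stringsAsFactors = FALSE
  )
}))

counts
```

In this toy data "dog" matches only the first title, because the word boundary after "dog" rules out "dogs", while "dogs" matches the second title three times.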