This question evaluates understanding of basic probabilistic language modeling—specifically bigram/first-order Markov models and weighted sampling for next-word prediction—within the Machine Learning (natural language processing) domain, emphasizing practical implementation skills alongside conceptual scalability trade-offs.
You are given a training set of token sequences (sentences), for example:
[["a","b","c"],
["a","s","d"]]
Task 1: Build a model that, for each word w, counts which words most frequently appear immediately after w (a bigram / first-order Markov model).
Task 2: Given a word w, output a random next word sampled proportionally to the observed counts after w (i.e., weighted by frequency).
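One way the two tasks could be implemented is sketched below, assuming Python: a `Counter` per preceding word for the bigram counts, and `random.choices` for frequency-weighted sampling. Function names (`train_bigram`, `sample_next`) and the `None` return for unseen words are illustrative choices, not part of the problem statement.

```python
import random
from collections import defaultdict, Counter

def train_bigram(sentences):
    """For each word w, count the words that appear immediately after w."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Each adjacent pair (prev, nxt) is one observed bigram.
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_next(counts, w):
    """Sample a next word for w, weighted by the observed bigram counts."""
    followers = counts.get(w)
    if not followers:
        return None  # w was never seen, or never had a following word
    words, weights = zip(*followers.items())
    return random.choices(words, weights=weights, k=1)[0]

sentences = [["a", "b", "c"], ["a", "s", "d"]]
model = train_bigram(sentences)
# "a" was followed once by "b" and once by "s", so each is drawn
# with probability 1/2; "b" is always followed by "c".
print(sample_next(model, "a"))
```

On this training set, `model["a"]` is `Counter({"b": 1, "s": 1})`, so sampling after `"a"` returns `"b"` or `"s"` with equal probability; a word with a single observed successor (like `"b"`) is deterministic.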