I need to count the number of occurrences of each word in a large text file (>5GB).
My plan is to build a HashMap and, while traversing the file, update each word’s count as I go.
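To make the idea concrete, here is a minimal sketch of that plan using a line-by-line read (the class and method names are just placeholders; this assumes the file is whitespace-delimited UTF-8):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class WordCount {
    // Stream the input line by line so only one line is held in memory,
    // updating the count for each word in a HashMap as we go.
    static Map<String, Long> countWords(Reader input) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String word : line.split("\\s+")) {
                    if (!word.isEmpty()) {
                        counts.merge(word, 1L, Long::sum);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> counts = countWords(new StringReader("the cat and the hat"));
        System.out.println(counts.get("the")); // 2
    }
}
```

This works as long as the file has line breaks, which is exactly the limitation described below.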
I can’t read the whole file at once because it would exhaust my memory, so I thought of reading it in chunks. The only approaches I could find read the file line by line, as in this link:
However, I couldn’t find a way to read the file in chunks when it has no line breaks (one huge row). How can I, for example, read a chunk of 1000 words on each read?
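For reference, the kind of approach I have in mind reads fixed-size character buffers with `Reader.read(char[])` (so it never depends on line breaks) and carries the partial word at the end of each buffer into the next one. This is only a sketch with hypothetical names, not tested on a real 5GB file:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class ChunkedWordCount {
    // Read fixed-size character chunks (no line breaks required) and
    // carry any word split at a chunk boundary into the next chunk.
    static Map<String, Long> countWords(Reader input, int bufferSize) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        char[] buffer = new char[bufferSize];
        StringBuilder carry = new StringBuilder(); // partial word from the previous chunk
        int read;
        while ((read = input.read(buffer)) != -1) {
            for (int i = 0; i < read; i++) {
                char c = buffer[i];
                if (Character.isWhitespace(c)) {
                    if (carry.length() > 0) {
                        counts.merge(carry.toString(), 1L, Long::sum);
                        carry.setLength(0);
                    }
                } else {
                    carry.append(c);
                }
            }
        }
        if (carry.length() > 0) { // last word, if the input doesn't end in whitespace
            counts.merge(carry.toString(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Deliberately tiny buffer to force words to straddle chunk boundaries.
        Map<String, Long> counts = countWords(new StringReader("the cat and the hat"), 4);
        System.out.println(counts.get("the")); // 2
    }
}
```

Batching by a fixed word count (e.g. 1000 words) would just mean flushing or processing `counts` every time 1000 words have been consumed; the boundary-carry logic stays the same.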