Gigawords Counter

This is a C and MPI based program that counts the number of words in a large body of text, regardless of how many files the text is stored in. The input files must meet three requirements: each line contains exactly one article, all punctuation has been removed, and all letters are capitalized. Given such input, the program counts the number of words and articles, as well as the number of distinct terms of varying lengths, as specified by the user. We will be using the TF*IDF (term frequency times inverse document frequency) measure to determine the importance of each term.
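To make the input format concrete, here is a minimal serial sketch in C of the counting step. It is not the project's actual code: the names `text_counts`, `count_text`, and `is_valid_input` are hypothetical, and the MPI distribution of work is left out. It assumes that words are separated by whitespace and that an empty line does not count as an article.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type; not taken from the project sources. */
typedef struct {
    size_t articles; /* non-empty lines */
    size_t words;    /* whitespace-separated tokens */
} text_counts;

/* Counts articles (one per non-empty line) and words in a
 * NUL-terminated buffer. Assumption: words are separated by whitespace. */
text_counts count_text(const char *text)
{
    text_counts c = {0, 0};
    int in_word = 0;          /* currently inside a word? */
    int line_has_content = 0; /* current line holds at least one word? */

    for (const char *p = text; *p != '\0'; ++p) {
        unsigned char ch = (unsigned char)*p;
        if (ch == '\n') {
            if (line_has_content)
                c.articles++;
            line_has_content = 0;
            in_word = 0;
        } else if (isspace(ch)) {
            in_word = 0;
        } else {
            if (!in_word) {
                c.words++;
                in_word = 1;
            }
            line_has_content = 1;
        }
    }
    /* The last article may not end with a newline. */
    if (line_has_content)
        c.articles++;
    return c;
}

/* Returns 1 if the buffer meets the stated input requirements
 * (no lowercase letters, no punctuation), 0 otherwise. */
int is_valid_input(const char *text)
{
    for (const char *p = text; *p != '\0'; ++p) {
        unsigned char ch = (unsigned char)*p;
        if (islower(ch) || ispunct(ch))
            return 0;
    }
    return 1;
}
```

For example, calling `count_text` on a buffer holding the two lines `THE QUICK BROWN FOX` and `HELLO WORLD` yields 2 articles and 6 words. In an MPI setting, one plausible design would have each process run such a function on its own share of the lines and then combine the per-process totals with a reduction, but that is an assumption about the design rather than a description of it.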

Gigawords Counter Development Team

Gigawords Counter Description

To view our project site click here