The distributed cache makes small files (or JARs, etc.) available locally on each node to MapReduce functions. This is useful, for example, when every mapper needs a global stopword list for index creation. Here are two correct ways of reading a file from the distributed cache in Hadoop 2. This changed with the new API, and very few books and tutorials have updated examples.
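To make the stopword use case concrete: once the cache has placed the file on a node's local disk, reading it is ordinary Java file I/O. The sketch below assumes one stopword per line in UTF-8. `StopwordLoader` and `load` are hypothetical names chosen for illustration, not part of Hadoop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a Hadoop class): loads a stopword list from a local file.
final class StopwordLoader {
    private StopwordLoader() {}

    // Assumes one stopword per line; trims whitespace, lowercases, skips blank lines.
    static Set<String> load(Path localFile) throws IOException {
        Set<String> stopwords = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    stopwords.add(word);
                }
            }
        }
        return stopwords;
    }
}
```

A mapper could call a helper like this once from its setup method and keep the resulting set in a field, so the file is read once per task rather than once per record.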
Named File
In the driver:
// The fragment after '#' becomes the name of a symlink in each task's working directory.
Job job = Job.getInstance(new Configuration());
job.addCacheFile(new URI("/path/to/file.csv" + "#filelabel"));
In the mapper:
@Override
public void setup(Context context …