Commit daace27

remove outdated instructions
1 parent ab64536 commit daace27

File tree

1 file changed: +2 -37 lines changed


assignment4/README.txt

Lines changed: 2 additions & 37 deletions
@@ -1,38 +1,3 @@
-Instructions on how to run example.pig.
+This repository describes an assignment for working with ~600GB of graph data using modern analytics languages.
 
-================================================================
-
-STEP 1:
-
-Importing the myudfs.jar file in pig. You need this because
-example.pig uses the function RDFSplit3(...) which is defined in myudfs.jar:
-
-OPTION 1: Do nothing. example.pig is already configured to read
-myudfs.jar from S3, through the line:
-
-register s3n://uw-cse-344-oregon.aws.amazon.com/myudfs.jar
-
-
-OPTION 2: do-it-yourself; run this on your local machine:
-
-cd pigtest
-ant   -- this should create the file myudfs.jar
-
-Next, modify example.pig to:
-
-register ./myudfs.jar
-
-Next, after you start the AWS cluster, copy myudfs.jar to the AWS
-Master Node (see hw6-awsusage.html).
-
-================================================================
-
-STEP 2:
-
-Start an AWS Cluster (see hw6-awsusage.html), start pig interactively,
-and cut and paste the content of example.pig. I prefer to do this line by line.
-
-
-Note: The program may appear to hang with a 0% completion time... go check the job tracker. Scroll down. You should see a MapReduce job running with some non-zero progress.
-
-Also note that the script will generate more than one MapReduce job.
+Start with assignment4.md
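
For context, the removed instructions refer to a Pig script, example.pig, that registers myudfs.jar and calls its RDFSplit3(...) UDF. That script is not part of this diff, and the exact signature of RDFSplit3 is not shown anywhere here, so the following is a minimal sketch only, assuming the UDF takes one raw line of RDF data and returns a (subject, predicate, object) tuple:

    -- Sketch only; not the assignment's actual example.pig.
    -- Assumption: myudfs.RDFSplit3(line) returns a (subject, predicate, object) tuple.
    register s3n://uw-cse-344-oregon.aws.amazon.com/myudfs.jar;

    -- 'input-path' is a placeholder for the assignment's data location.
    raw = LOAD 'input-path' USING TextLoader() AS (line:chararray);
    triples = FOREACH raw GENERATE FLATTEN(myudfs.RDFSplit3(line))
              AS (subject:chararray, predicate:chararray, object:chararray);
    DUMP triples;

Under the removed OPTION 2, the first line would instead read "register ./myudfs.jar;" after building the jar locally with ant, and the statements would then be pasted line by line into Pig's interactive grunt shell on the cluster's master node.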
