public class HadoopToolsUtil extends Object
Constructor and Description |
---|
HadoopToolsUtil() |
Modifier and Type | Method and Description |
---|---|
static String[] | decodeArgs(String[] args): A horrible hack to deal with Hadoop's horrible hack when setting arrays of strings as configs. |
static String[] | encodeArgs(String[] args): A horrible hack to deal with Hadoop's horrible hack when setting arrays of strings as configs. |
static boolean | fileExists(String path): Use the Hadoop filesystem to check if the given path exists. |
static org.apache.hadoop.fs.FileSystem | getFileSystem(org.apache.hadoop.fs.Path p): Get the FileSystem corresponding to a Path. |
static org.apache.hadoop.fs.FileSystem | getFileSystem(URI uri) |
static org.apache.hadoop.fs.Path[] | getInputPaths(InOutToolOptions options): Get the input paths from an InOutToolOptions. |
static org.apache.hadoop.fs.Path[] | getInputPaths(String path): Get the input paths from a String. |
static org.apache.hadoop.fs.Path[] | getInputPaths(String[] paths) |
static org.apache.hadoop.fs.Path[] | getInputPaths(String[] paths, String subdir): All the files starting with "part" in the paths which look like "paths[i]/subdir". |
static org.apache.hadoop.fs.Path | getOutputPath(InOutToolOptions options): Get the output path from an InOutToolOptions. |
static org.apache.hadoop.fs.Path | getOutputPath(String path): Get the output path from a String. |
static String[] | readlines(String p): Read a whole Hadoop file into an array of strings, one per line. |
static void | removeFile(String f): Delete a file. |
static void | validateInput(InOutToolOptions tool) |
static void | validateOutput(InOutToolOptions tool) |
static void | validateOutput(String outpath, boolean replace) |
public HadoopToolsUtil()
public static void validateOutput(InOutToolOptions tool) throws org.kohsuke.args4j.CmdLineException
Parameters:
tool - options to get data from
Throws:
org.kohsuke.args4j.CmdLineException
public static void validateOutput(String outpath, boolean replace) throws org.kohsuke.args4j.CmdLineException
Parameters:
outpath - the desired output
replace - whether the existing outputs should be removed
Throws:
org.kohsuke.args4j.CmdLineException
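The page does not spell out what validateOutput does when the output already exists, so the following is only a minimal sketch of one plausible reading of the replace flag. It uses the local filesystem (java.nio) rather than Hadoop's FileSystem, and the class name ValidateOutputSketch and the use of IllegalStateException in place of CmdLineException are assumptions made to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ValidateOutputSketch {

    // Hypothetical stand-in for validateOutput(String outpath, boolean replace),
    // working on the local filesystem instead of a Hadoop FileSystem.
    public static void validateOutput(String outpath, boolean replace) throws IOException {
        Path out = Paths.get(outpath);
        if (!Files.exists(out)) {
            return; // nothing is in the way, the output location is usable
        }
        if (!replace) {
            // Assumption: the real method reports this case as a CmdLineException.
            throw new IllegalStateException("Output exists and replace is false: " + outpath);
        }
        // replace == true: remove the existing output, children before parents.
        List<Path> toDelete;
        try (Stream<Path> walk = Files.walk(out)) {
            toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : toDelete) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("job-output");
        Files.createFile(dir.resolve("part-00000"));
        validateOutput(dir.toString(), true);
        System.out.println(Files.exists(dir)); // prints "false"
    }
}
```

Failing fast when replace is false mirrors the usual behaviour of command-line tools that refuse to overwrite results unless explicitly told to.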
public static org.apache.hadoop.fs.FileSystem getFileSystem(URI uri) throws IOException
Parameters:
uri
Throws:
IOException
public static org.apache.hadoop.fs.FileSystem getFileSystem(org.apache.hadoop.fs.Path p) throws IOException
Get the FileSystem corresponding to a Path.
Parameters:
p - the path.
Throws:
IOException
public static void validateInput(InOutToolOptions tool) throws org.kohsuke.args4j.CmdLineException
Parameters:
tool
Throws:
org.kohsuke.args4j.CmdLineException
public static void removeFile(String f) throws IOException
Delete a file.
Parameters:
f - the file to delete
Throws:
IOException
public static org.apache.hadoop.fs.Path getOutputPath(InOutToolOptions options)
Get the output path from an InOutToolOptions.
Parameters:
options - the InOutToolOptions.
public static org.apache.hadoop.fs.Path getOutputPath(String path)
Get the output path from a String.
Parameters:
path - the path string
public static org.apache.hadoop.fs.Path[] getInputPaths(InOutToolOptions options) throws IOException
Get the input paths from an InOutToolOptions. This will resolve the input path and return either a Path object representing the string or, if the path string is a directory, a list of Paths representing all the "part" files.
Parameters:
options - the InOutToolOptions.
Throws:
IOException
public static org.apache.hadoop.fs.Path[] getInputPaths(String path) throws IOException
Get the input paths from a String. Returns either a Path object representing the string or, if the path string is a directory, a list of Paths representing all the "part" files.
Parameters:
path - the path string
Throws:
IOException
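The resolution rule described above (a plain file resolves to itself, a directory resolves to its "part" files) can be sketched as follows. This is a local-filesystem analogue built on java.nio, not the Hadoop implementation; the class name InputPathsSketch, the method name resolveInputPaths, and the sorted return order are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class InputPathsSketch {

    // Hypothetical analogue of getInputPaths(String path) on the local filesystem.
    public static List<Path> resolveInputPaths(String path) throws IOException {
        Path p = Paths.get(path);
        if (!Files.isDirectory(p)) {
            // Not a directory: the path string itself is the only input.
            return Collections.singletonList(p);
        }
        // Directory: keep only direct children whose file name starts with "part".
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(p, "part*")) {
            for (Path child : children) {
                parts.add(child);
            }
        }
        Collections.sort(parts); // directory listing order is unspecified, so sort for stable output
        return parts;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("job-output");
        Files.createFile(dir.resolve("part-00000"));
        Files.createFile(dir.resolve("part-00001"));
        Files.createFile(dir.resolve("_SUCCESS")); // marker file, filtered out
        for (Path p : resolveInputPaths(dir.toString())) {
            System.out.println(p.getFileName()); // prints part-00000, then part-00001
        }
    }
}
```

Filtering on the "part" prefix skips marker and log files that a job typically leaves next to its output.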
public static org.apache.hadoop.fs.Path[] getInputPaths(String[] paths) throws IOException
Parameters:
paths
Throws:
IOException
public static org.apache.hadoop.fs.Path[] getInputPaths(String[] paths, String subdir) throws IOException
All the files starting with "part" in the paths which look like "paths[i]/subdir".
Parameters:
paths
subdir
Throws:
IOException
public static boolean fileExists(String path) throws IOException
Use the Hadoop filesystem to check if the given path exists.
Parameters:
path - the path to the file
Throws:
IOException
public static String[] readlines(String p) throws IOException
Read a whole Hadoop file into an array of strings, one per line.
Parameters:
p - a path
Throws:
IOException
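public static String[] encodeArgs(String[] args)
A horrible hack to deal with Hadoop's horrible hack when setting arrays of strings as configs.
Parameters:
args
public static String[] decodeArgs(String[] args)
A horrible hack to deal with Hadoop's horrible hack when setting arrays of strings as configs.
Parameters:
args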
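The "hack" these two methods work around is presumably that Hadoop's Configuration stores a string array as a single comma-delimited value (setStrings / getStrings), so an argument that itself contains a comma comes back split in two. The page does not document the encoding that encodeArgs and decodeArgs actually use; the sketch below assumes URL-safe Base64 purely for illustration, and the class ArgCodecSketch and the helper throughCommaSeparatedConfig are hypothetical stand-ins that need no Hadoop dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class ArgCodecSketch {

    // Assumed scheme: Base64 output never contains a comma, so it survives a comma-delimited store.
    public static String[] encodeArgs(String[] args) {
        String[] out = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            out[i] = Base64.getUrlEncoder().encodeToString(args[i].getBytes(StandardCharsets.UTF_8));
        }
        return out;
    }

    // Inverse of encodeArgs under the same assumed scheme.
    public static String[] decodeArgs(String[] args) {
        String[] out = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            out[i] = new String(Base64.getUrlDecoder().decode(args[i]), StandardCharsets.UTF_8);
        }
        return out;
    }

    // Hypothetical stand-in for a config store that joins values with commas and splits on read.
    public static String[] throughCommaSeparatedConfig(String[] values) {
        return String.join(",", values).split(",", -1);
    }

    public static void main(String[] args) {
        String[] original = {"-m", "a,b", "--name=x y"};

        // Without encoding, the embedded comma splits "a,b" into two entries.
        System.out.println(throughCommaSeparatedConfig(original).length); // prints "4"

        // With encoding, the round trip restores the original array.
        String[] restored = decodeArgs(throughCommaSeparatedConfig(encodeArgs(original)));
        System.out.println(Arrays.equals(original, restored)); // prints "true"
    }
}
```

The intended pattern is to call encodeArgs before writing the arguments into the job configuration and decodeArgs after reading them back on the task side.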