public class ImageSiteURLExtractor extends SiteSpecificURLExtractor
Uses SiteSpecificConsumers for common image hosting sites to
determine whether the input URL is likely to lead to an image or images.
Currently, the following consumers are included:
Fields inherited from class SiteSpecificURLExtractor: siteSpecific
Constructor and Description |
---|
ImageSiteURLExtractor() Default constructor; includes Tumblr support. |
ImageSiteURLExtractor(boolean tumblr) Construct with or without Tumblr support. |
ImageSiteURLExtractor(boolean tumblr, boolean fallback) Construct with or without Tumblr support, and with or without the direct-download fallback. |
Modifier and Type | Method and Description |
---|---|
protected List<URL> | processURLs(URL url) First, try all the SiteSpecificConsumer instances loaded into SiteSpecificURLExtractor.siteSpecific. |
Inherited methods: apply
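public ImageSiteURLExtractor(boolean tumblr, boolean fallback)
Construct with or without Tumblr support.
Parameters:
tumblr - true if Tumblr support is required.
fallback - true if the extractor should try to download the URL directly.
public ImageSiteURLExtractor(boolean tumblr)
Construct with or without Tumblr support.
Parameters:
tumblr - true if Tumblr support is required.
public ImageSiteURLExtractor()
Default constructor; includes Tumblr support.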
protected List<URL> processURLs(URL url)
First, try all the SiteSpecificConsumer instances loaded into SiteSpecificURLExtractor.siteSpecific. If any consumer takes control of a link, the consumer's output is used. If this fails, use HttpUtils.readURLAsByteArrayInputStream(URL, org.apache.http.client.RedirectStrategy) with a StatusConsumerRedirectStrategy, which specifically prevents redirects from being followed automatically and forces this function to be called for each redirect.
Overrides:
processURLs in class SiteSpecificURLExtractor
Parameters:
url -
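The control flow described for processURLs can be illustrated with a small, self-contained sketch. It does not use the library's classes: Consumer and RedirectLookup are hypothetical stand-ins for SiteSpecificConsumer and for the non-following HTTP fetch, MAX_HOPS is an assumed loop guard, and the example hosts are invented. The point is only to show why each redirect target is passed back through the same method: a short link that redirects to an image host gets a second chance at the site-specific consumers.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ProcessUrlsSketch {
    /** Hypothetical stand-in for SiteSpecificConsumer; the real interface may differ. */
    interface Consumer {
        boolean canConsume(URL url);
        List<URL> consume(URL url);
    }

    /** Hypothetical stand-in for a fetch that does NOT follow redirects itself. */
    interface RedirectLookup {
        /** Returns the redirect target, or null if the URL does not redirect. */
        URL redirectTarget(URL url);
    }

    static final int MAX_HOPS = 5; // assumption: guard against redirect loops

    static List<URL> process(URL url, List<Consumer> consumers, RedirectLookup http, int hops) {
        // Step 1: site-specific consumers get the first chance at every URL.
        for (Consumer c : consumers) {
            if (c.canConsume(url)) {
                return c.consume(url);
            }
        }
        // Step 2: fallback. Redirects are not followed automatically; each
        // redirect target is fed back through this method instead.
        List<URL> out = new ArrayList<>();
        if (http == null) {
            return out; // fallback disabled
        }
        URL next = http.redirectTarget(url);
        if (next == null) {
            out.add(url); // no redirect: treat the URL itself as the resource
            return out;
        }
        if (hops >= MAX_HOPS) {
            return out; // give up on long redirect chains
        }
        return process(next, consumers, http, hops + 1);
    }

    static URL toUrl(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Invented consumer: maps pages on img.example.com to a raw .jpg URL. */
    static Consumer demoImageHost() {
        return new Consumer() {
            public boolean canConsume(URL url) {
                return "img.example.com".equals(url.getHost());
            }
            public List<URL> consume(URL url) {
                List<URL> out = new ArrayList<>();
                out.add(toUrl("https://img.example.com/raw" + url.getPath() + ".jpg"));
                return out;
            }
        };
    }

    /** Invented redirect table standing in for real HTTP responses. */
    static RedirectLookup demoLookup() {
        Map<String, String> redirects = new HashMap<>();
        redirects.put("https://short.example.org/x", "https://img.example.com/abc");
        return u -> {
            String target = redirects.get(u.toString());
            return target == null ? null : toUrl(target);
        };
    }

    public static void main(String[] args) {
        List<Consumer> consumers = Collections.singletonList(demoImageHost());
        URL shortLink = toUrl("https://short.example.org/x");
        System.out.println(process(shortLink, consumers, demoLookup(), 0));
    }
}
```

In this sketch the short link matches no consumer on the first pass, so the fallback looks up its redirect target and calls process again; on the second pass the image-host consumer takes control. Had the redirect been followed silently inside the HTTP client, the consumer would never have seen the intermediate URL.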