public class SiteSpecificURLExtractor extends Object implements MultiFunction<URL,URL>
SiteSpecificConsumers to the
input.

| Modifier and Type | Field and Description |
|---|---|
| protected List<SiteSpecificConsumer> | siteSpecific: the site-specific consumers |
| Modifier | Constructor and Description |
|---|---|
| protected | SiteSpecificURLExtractor(): Construct with empty list of consumers. |
|  | SiteSpecificURLExtractor(List<SiteSpecificConsumer> consumers): Construct with the given list of consumers. |
|  | SiteSpecificURLExtractor(SiteSpecificConsumer... consumers): Construct with the given consumers. |
| Modifier and Type | Method and Description |
|---|---|
| List<URL> | apply(URL in): Apply the function to the input argument and return the result(s). |
| protected List<URL> | processURLs(URL url): First, try all the SiteSpecificConsumer instances loaded into siteSpecific. |
protected List<SiteSpecificConsumer> siteSpecific
public SiteSpecificURLExtractor(List<SiteSpecificConsumer> consumers)
consumers - the consumers

public SiteSpecificURLExtractor(SiteSpecificConsumer... consumers)
consumers - the consumers

protected SiteSpecificURLExtractor()

public List<URL> apply(URL in)
Apply the function to the input argument and return the result(s).
Specified by: apply in interface MultiFunction<URL,URL>
in - the input object

protected List<URL> processURLs(URL url)
First, try all the SiteSpecificConsumer instances loaded into
siteSpecific. If any consumer takes control of a link, the
consumer's output is used. If this fails, use
HttpUtils#readURLAsByteArrayInputStream(URL, org.apache.http.client.RedirectStrategy)
with a StatusConsumerRedirectStrategy, which specifically
prevents redirects from being handled automatically and forces this
function to be called for each redirect.
url -
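
The control flow described for processURLs can be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's implementation: LinkConsumer, its canConsume/consume methods, the redirectLookup function (standing in for the HTTP request made with redirects disabled), the maxHops limit, and the choice to return the final URL when nothing matches are all assumptions made for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a site-specific consumer; not the library interface. */
interface LinkConsumer {
    boolean canConsume(URL url);
    List<URL> consume(URL url);
}

class ConsumerFallbackSketch {
    private final List<LinkConsumer> consumers;
    /** Hypothetical stand-in for the HTTP step: the redirect target, or null if none. */
    private final Function<URL, URL> redirectLookup;
    /** Assumed safety limit so that redirect loops terminate. */
    private final int maxHops;

    ConsumerFallbackSketch(List<LinkConsumer> consumers,
                           Function<URL, URL> redirectLookup, int maxHops) {
        this.consumers = consumers;
        this.redirectLookup = redirectLookup;
        this.maxHops = maxHops;
    }

    List<URL> process(URL url) {
        return process(url, 0);
    }

    private List<URL> process(URL url, int hops) {
        // Step 1: give every consumer a chance to take control of the link.
        for (LinkConsumer consumer : consumers) {
            if (consumer.canConsume(url)) {
                List<URL> out = consumer.consume(url);
                if (out != null && !out.isEmpty()) {
                    return out;
                }
            }
        }
        // Step 2: no consumer matched. Redirects are not followed automatically;
        // each redirect target is passed back through this method instead, so
        // the consumers see every hop.
        URL target = hops < maxHops ? redirectLookup.apply(url) : null;
        if (target != null) {
            return process(target, hops + 1);
        }
        // Assumption of this sketch: fall back to the URL itself.
        return Collections.singletonList(url);
    }

    /** Builds a URL without a checked exception, for brevity in examples. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Demo consumer: claims URLs on one host and appends a suffix. */
    static LinkConsumer hostConsumer(final String host, final String suffix) {
        return new LinkConsumer() {
            public boolean canConsume(URL u) {
                return host.equals(u.getHost());
            }
            public List<URL> consume(URL u) {
                return Collections.singletonList(url(u + suffix));
            }
        };
    }

    public static void main(String[] args) {
        ConsumerFallbackSketch sketch = new ConsumerFallbackSketch(
                Collections.singletonList(hostConsumer("images.example", ".jpg")),
                u -> "short.example".equals(u.getHost())
                        ? url("http://images.example/cat") : null,
                5);
        // The short link redirects to images.example, where the consumer takes over.
        System.out.println(sketch.process(url("http://short.example/abc")));
    }
}
```

Re-entering the method on every redirect is what lets a consumer claim a link that only becomes recognisable after a shortener has been resolved. The sketch compares hosts as strings rather than calling URL.equals, since that method may trigger name resolution.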