public class SiteSpecificURLExtractor extends Object implements MultiFunction<URL,URL>
Applies SiteSpecificConsumers to the input.
Modifier and Type | Field and Description |
---|---|
protected List<SiteSpecificConsumer> | siteSpecific - the site-specific consumers |
Modifier | Constructor and Description |
---|---|
protected | SiteSpecificURLExtractor() - Construct with empty list of consumers. |
public | SiteSpecificURLExtractor(List<SiteSpecificConsumer> consumers) - Construct with the given list of consumers. |
public | SiteSpecificURLExtractor(SiteSpecificConsumer... consumers) - Construct with the given consumers. |
Modifier and Type | Method and Description |
---|---|
List<URL> | apply(URL in) - Apply the function to the input argument and return the result(s). |
protected List<URL> | processURLs(URL url) - First, try all the SiteSpecificConsumer instances loaded into siteSpecific. |
protected List<SiteSpecificConsumer> siteSpecific
public SiteSpecificURLExtractor(List<SiteSpecificConsumer> consumers)
consumers - the consumers
public SiteSpecificURLExtractor(SiteSpecificConsumer... consumers)
consumers - the consumers
protected SiteSpecificURLExtractor()
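The three constructors above follow a common Java pattern: a varargs overload for convenience, a list overload for callers that already hold a collection, and a protected no-argument form for subclasses. The sketch below illustrates that pattern with stand-in types only. `NamedConsumer`, `ExtractorSketch`, `DefaultsSketch` and `consumerCount()` are hypothetical names, and copying the list and having a subclass register its own defaults are assumptions rather than documented behaviour of `SiteSpecificURLExtractor`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for SiteSpecificConsumer; not the real interface. */
interface NamedConsumer {
    String name();
}

/** Stand-in mirroring the three constructor shapes listed above. */
class ExtractorSketch {
    /** The site-specific consumers (same role as the siteSpecific field). */
    protected List<NamedConsumer> siteSpecific;

    /** Construct with an empty list of consumers; only subclasses can call this. */
    protected ExtractorSketch() {
        this.siteSpecific = new ArrayList<>();
    }

    /** Construct with the given list of consumers (copied here, by assumption). */
    public ExtractorSketch(List<NamedConsumer> consumers) {
        this.siteSpecific = new ArrayList<>(consumers);
    }

    /** Construct with the given consumers by delegating to the list form. */
    public ExtractorSketch(NamedConsumer... consumers) {
        this(Arrays.asList(consumers));
    }

    /** Hypothetical accessor, added only so the sketch can be inspected. */
    public int consumerCount() {
        return siteSpecific.size();
    }
}

/** One possible use of the protected constructor: a subclass adds its own defaults. */
class DefaultsSketch extends ExtractorSketch {
    DefaultsSketch() {
        super();
        siteSpecific.add(() -> "default");
    }
}
```

With these stand-ins, `new ExtractorSketch(a, b)` and `new ExtractorSketch(Arrays.asList(a, b))` hold the same two consumers, while `new DefaultsSketch()` starts from the empty list and registers one of its own.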
public List<URL> apply(URL in)
Apply the function to the input argument and return the result(s).
Specified by: apply in interface MultiFunction<URL,URL>
in - the input object
protected List<URL> processURLs(URL url)
First, try all the SiteSpecificConsumer instances loaded into siteSpecific. If any consumer takes control of a link, the consumer's output is used. If this fails, use HttpUtils#readURLAsByteArrayInputStream(URL, org.apache.http.client.RedirectStrategy) with a StatusConsumerRedirectStrategy, which specifically prevents redirects from being followed automatically and forces this function to be called for each redirect.
url -
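The control flow described for processURLs can be sketched as follows: consumers get the first chance at a link, and only if none takes control is the URL fetched with automatic redirects disabled, so that each redirect target passes through the consumers again. This is a minimal illustration under stated assumptions, not the library's implementation. `LinkConsumer`, `ResolverSketch` and the `MAX_HOPS` limit are hypothetical, URLs are plain strings, and a map stands in for the HTTP request so the sketch runs without network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for SiteSpecificConsumer; the real interface may differ. */
interface LinkConsumer {
    boolean canConsume(String url);
    List<String> consume(String url);
}

/** Sketch of the consumer-first, redirect-fallback flow. */
class ResolverSketch {
    private static final int MAX_HOPS = 10; // assumed guard against redirect loops

    private final List<LinkConsumer> consumers;
    // Simulated redirect table (source URL -> target URL) standing in for HTTP.
    private final Map<String, String> redirects;

    ResolverSketch(List<LinkConsumer> consumers, Map<String, String> redirects) {
        this.consumers = consumers;
        this.redirects = redirects;
    }

    List<String> resolve(String url) {
        return resolve(url, 0);
    }

    private List<String> resolve(String url, int hops) {
        // 1. Let each consumer try to take control of the link.
        for (LinkConsumer c : consumers) {
            if (c.canConsume(url)) {
                List<String> out = c.consume(url);
                if (out != null && !out.isEmpty()) {
                    return out;
                }
            }
        }
        // 2. No consumer handled it: look up a redirect without following it
        //    automatically, and re-enter so consumers see the redirect target.
        String next = redirects.get(url);
        if (next != null && hops < MAX_HOPS) {
            return resolve(next, hops + 1);
        }
        // 3. No redirect (or hop limit reached): treat the URL as final.
        List<String> result = new ArrayList<>();
        result.add(url);
        return result;
    }
}
```

Re-entering the resolver on every hop is what makes disabling automatic redirects useful: a shortened link that redirects to a site with a dedicated consumer is handed to that consumer instead of being downloaded as a generic page.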