API Documentation
Persine is built on a triumvirate of pieces:
- The PersonaEngine, which more or less stores the settings for everything you’d like to do, and serves as the entry point for all of your adventures.
- Persona, which are the users that interact with websites. Each persona is attached to a Chrome profile, so browsing history, cookies, etc. can all carry over to subsequent sessions. (Note that by default, sessions do not carry information over.)
- bridges, which are the interfaces between Persine and the data on the website. They’re the scrapers that pull the recommendations off of the page, and the tools that enable you to write shortcuts like youtube:search?kittens
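One way to read a shortcut such as `youtube:search?kittens` (or `youtube:next_up#30`, which appears further down) is as a bridge name, an action, an optional argument, and an optional repeat count. The sketch below only illustrates that reading: `parse_command` is a hypothetical helper, not part of Persine, and Persine's own parsing may differ.

```python
def parse_command(command):
    """Split a shortcut such as 'youtube:search?kittens' into its parts.

    Hypothetical helper for illustration; not Persine's own parser.
    """
    # Plain URLs are passed through untouched.
    if command.startswith(("http://", "https://")):
        return {"bridge": None, "action": "visit", "arg": command, "repeat": 1}

    bridge, _, rest = command.partition(":")

    # A trailing '#N' (digits only) is read as a repeat count, e.g. next_up#30.
    repeat = 1
    head, sep, count = rest.rpartition("#")
    if sep and count.isdigit():
        rest, repeat = head, int(count)

    # Anything after '?' is read as the action's argument.
    action, _, arg = rest.partition("?")
    return {"bridge": bridge, "action": action, "arg": arg or None, "repeat": repeat}


search = parse_command("youtube:search?kittens")
next_up = parse_command("youtube:next_up#30")
```

Here `search` describes a single `search` action on the `youtube` bridge with the argument `kittens`, and `next_up` describes the `next_up` action repeated 30 times.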
PersonaEngine
-
class
persine.
PersonaEngine
(height=1200, width=1600, screenshot_scale=0.5, screenshot=None, html=None, compress_html=True, cache_dir=None, data_dir=None, headless=False, driver=None, resume=False, ublock=False)¶ Bases:
object
- PersonaEngine is used to generate personas. You can think of it as a place to store all of your settings.
- Parameters
height (int) – Height of the browser window
width (int) – Width of the browser window
screenshot_scale (float) – Scaling factor for saved screenshots
screenshot (Union[str, list]) – Whether screenshots are saved, and whether they go to history or to disk
html (Union[str, list]) – Whether HTML is saved, and whether it goes to history or to disk
compress_html (boolean) – Whether HTML should be compressed or not before saving to the history
cache_dir (str) – Where to save on-disk screenshots and HTML files
data_dir (str) – Root directory where persona data (Chrome profiles) are stored
headless (boolean) – Whether to start the browser in headless mode
driver – WebDriver to use instead of starting a new one
resume (boolean) – Whether to pick up where the previous run left off.
ublock (boolean) – Whether to automatically install uBlock Origin
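As a rough illustration of how these parameters fit together, the sketch below collects keyword arguments for a headless run. `engine_settings` is a hypothetical helper and the directory names are placeholders; only parameter names from the signature above are used. `screenshot` and `html` are left at their defaults because their accepted values are not listed here.

```python
from pathlib import Path


def engine_settings(base_dir, headless=True):
    """Build keyword arguments for PersonaEngine (hypothetical helper)."""
    base = Path(base_dir)
    return {
        "height": 1200,
        "width": 1600,
        "screenshot_scale": 0.5,
        "compress_html": True,
        "cache_dir": str(base / "cache"),      # placeholder location
        "data_dir": str(base / "profiles"),    # placeholder location
        "headless": headless,
        "resume": False,
        "ublock": False,
    }


settings = engine_settings("persine-data")
# With Persine installed, these could presumably be passed straight through:
#   from persine import PersonaEngine
#   engine = PersonaEngine(**settings)
```

Keeping the settings in one dictionary makes it easy to reuse the same configuration across several runs.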
-
get_driver_options
(user_data_dir=None)¶ Create the options necessary to start the appropriate webdriver.Chrome instance
- Returns
webdriver.ChromeOptions
-
get_state
(driver, url)¶ Get the current state of the page.
- Returns
A representation of the current page (key, action, url, etc)
- Return type
dict
-
launch
(user_data_dir=None)¶ Launches a Chrome instance.
- Returns
webdriver.Chrome
-
persona
(name=None, resume=False)¶ Initializes a persona with the given name.
- Returns
The persona initialized by the engine.
- Return type
Persona
-
run
(driver, url)¶ Runs a command through the appropriate bridge.
- Returns
A single state representation, or a list of state representations if it’s a multi-step command. For example, youtube:next_up#30 hits ‘next up’ 30 times and returns one state per step.
- Return type
Union[dict, list(dict)]
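Because the return value can be either a single dict or a list of dicts, calling code may want to normalize it before looping. A minimal sketch, assuming nothing about the keys inside each state; `as_state_list` is a hypothetical helper, and the sample states are made up for the example.

```python
def as_state_list(result):
    """Return a list of state dicts whether `result` is one dict or a list."""
    if isinstance(result, dict):
        return [result]
    return list(result)


# Made-up states standing in for real return values.
single = {"url": "https://www.youtube.com"}
multi = [{"url": "step-1"}, {"url": "step-2"}]

single_states = as_state_list(single)
multi_states = as_state_list(multi)
```

After normalizing, single-step and multi-step commands can be handled by the same loop.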
-
take_screenshot
(driver)¶ Take a screenshot of the current window.
- Returns
The resized screenshot
- Return type
Image
Persona
-
class
persine.
Persona
(engine, name=None, history_path=None, user_data_dir=None, resume=False, overwrite=False)¶ Bases:
object
The Persona represents a single user. If it is given a name, it is associated with an individual Chrome profile.
- Parameters
engine (PersonaEngine) – The engine to associate with this persona
name (str) – The name to be given to this profile. If not named, an empty profile is used.
history_path (str) – Path to the JSON file that holds this persona’s action/browsing history
user_data_dir (str) – If specified, load the Chrome profile from this folder
resume (boolean) – Whether this persona should resume a previous persona with the same name. If False, the previous Chrome profile is deleted.
overwrite (boolean) – Whether to prompt the user when overwriting a previous persona’s Chrome profile (see resume)
-
clear
()¶ Deletes all previous data for that Chrome profile, including history file and user_data_dir
-
launch
()¶ Launches a browser through PersonaEngine
-
load_history
()¶ Loads the browsing/command history from a file
-
quit
()¶ Quits the browser
-
run
(url, notes=None)¶ Runs a single command and updates the history.
- Parameters
url (str) – The action to run or URL to visit
notes (dict) – Additional information to include in the history row
- Returns
A single state representation, or a list of state representations if it’s a multi-step command. For example, youtube:next_up#30 hits ‘next up’ 30 times.
- Return type
Union[dict, list(dict)]
-
run_batch
(urls)¶ Run a series of commands
-
save_history
()¶ Saves the browsing/command history to a file
-
update_history
(state, notes=None)¶ Updates history/recommendations lists with the given state
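The methods above suggest a simple session pattern: launch, run some commands, then quit. The sketch below shows that pattern under stated assumptions. `run_session` is a hypothetical helper, and `FakePersona` is a stand-in that only mimics the documented method names so the example runs without a browser; it is not Persine's Persona.

```python
class FakePersona:
    """Stand-in with the same method names as Persona, for demonstration only."""

    def __init__(self):
        self.history = []
        self.running = False

    def launch(self):
        self.running = True

    def run(self, url, notes=None):
        state = {"url": url, "notes": notes}
        self.history.append(state)
        return state

    def quit(self):
        self.running = False


def run_session(persona, commands, notes=None):
    """Launch the browser, run each command, and always quit afterwards."""
    states = []
    persona.launch()
    try:
        for command in commands:
            result = persona.run(command, notes=notes)
            # run() may return one dict or a list of dicts.
            states.extend([result] if isinstance(result, dict) else result)
    finally:
        # Quit even if a command raises, so no browser is left open.
        persona.quit()
    return states


persona = FakePersona()
states = run_session(persona, ["https://www.youtube.com", "youtube:search?kittens"])
```

With Persine itself, a persona obtained from `engine.persona(name)` would presumably take the place of `FakePersona`.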
Bridges
-
class
persine.bridges.
AmazonBridge
(driver)¶ Bases:
persine.bridges.bridge.BaseBridge
A bridge that interacts with and scrapes Amazon
-
get_data
()¶ Return important data from the page, as well as a list of the recommendations
- Returns
Representation of the page
- Return type
dict
-
run
(url)¶ Run an action/visit a URL.
- Returns
Representation of the page
- Return type
dict
-
class
persine.bridges.
BaseBridge
(driver)¶ Bases:
object
A completely useless Bridge that at least shows you what they’re supposed to implement
- Parameters
driver – A Selenium WebDriver used to navigate
-
get_data
()¶ Return important data from the page, as well as a list of the recommendations
- Returns
Representation of the page
- Return type
dict
-
run
(url)¶ Run an action/visit a URL.
- Returns
Representation of the page
- Return type
dict
-
class
persine.bridges.
YoutubeBridge
(driver)¶ Bases:
persine.bridges.bridge.BaseBridge
A bridge that interacts with and scrapes YouTube
-
get_data
()¶ Return important data from the page, as well as a list of the recommendations
- Returns
Representation of the page
- Return type
dict
-
run
(url)¶ Run an action/visit a URL.
- Returns
Representation of the page
- Return type
dict
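The bridge interface above comes down to two methods, `run(url)` and `get_data()`, each returning a dict. The sketch below shows what a custom bridge might look like. Everything in it is an assumption made for illustration: `BaseBridge` is redefined locally as a stand-in so the example runs without Persine, `TitleBridge` and `FakeDriver` are hypothetical, and the dict keys are invented rather than taken from Persine. The driver calls (`get`, `current_url`, `title`) match Selenium's WebDriver interface.

```python
class BaseBridge:
    """Local stand-in mirroring the documented interface (driver, run, get_data)."""

    def __init__(self, driver):
        self.driver = driver

    def run(self, url):
        raise NotImplementedError

    def get_data(self):
        raise NotImplementedError


class TitleBridge(BaseBridge):
    """Hypothetical bridge that records only the page URL and title."""

    def run(self, url):
        # Visit the page, then describe it.
        self.driver.get(url)
        return self.get_data()

    def get_data(self):
        return {
            "url": self.driver.current_url,
            "title": self.driver.title,
            "recommendations": [],  # a real bridge would scrape these
        }


class FakeDriver:
    """Minimal stand-in for a Selenium WebDriver, for demonstration only."""

    def __init__(self):
        self.current_url = None
        self.title = ""

    def get(self, url):
        self.current_url = url
        self.title = "Example page"


page = TitleBridge(FakeDriver()).run("https://example.com")
```

A real bridge would replace the empty `recommendations` list with whatever it scrapes from the page.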