API Documentation

Persine is built on a triumvirate of pieces:

  1. The PersonaEngine, which more or less stores the settings for everything you’d like to do, and serves as the entry point for all of your adventures.

  2. Personas, which are the users that interact with websites. Each persona is attached to a Chrome profile, so browsing history, cookies, etc. can all carry over to subsequent sessions. (Note that by default sessions do not carry information over.)

  3. Bridges, which are the interfaces between Persine and the data on the website. They’re the scrapers that pull the recommendations off of the page, and the tools that enable you to write shortcuts like youtube:search?kittens.
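
The shortcut syntax is easier to see once it is split into parts. The sketch below is not Persine's own parser: the grammar (bridge:action, an optional ?argument, an optional #count) is inferred only from the two examples on this page, youtube:search?kittens and youtube:next_up#30, and the names Command and parse_command are hypothetical.

```python
from typing import NamedTuple, Optional


class Command(NamedTuple):
    """Hypothetical container for the parts of a shortcut."""
    bridge: str              # e.g. "youtube"
    action: str              # e.g. "search" or "next_up"
    argument: Optional[str]  # text after "?", if any
    repeat: int              # count after "#", defaulting to 1


def parse_command(text: str) -> Optional[Command]:
    """Split a shortcut into its parts; return None for ordinary URLs.

    Assumed grammar: bridge:action[?argument][#count].
    """
    # Anything with a scheme separator is treated as a plain URL.
    if "://" in text or ":" not in text:
        return None
    bridge, rest = text.split(":", 1)
    rest, _, count = rest.partition("#")
    action, _, argument = rest.partition("?")
    return Command(bridge, action, argument or None, int(count) if count else 1)


print(parse_command("youtube:search?kittens"))
print(parse_command("youtube:next_up#30"))
```

Ordinary URLs fall through as None here, so a caller could hand them to the browser unchanged.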

PersonaEngine

class persine.PersonaEngine(height=1200, width=1600, screenshot_scale=0.5, screenshot=None, html=None, compress_html=True, cache_dir=None, data_dir=None, headless=False, driver=None, resume=False, ublock=False)

Bases: object

PersonaEngine is used to generate personas. You can think of it as a place to store all of your settings.

Parameters
  • height (int) – Height of the browser window

  • width (int) – Width of the browser window

  • screenshot_scale (float) – Scaling factor for saved screenshots

  • screenshot (Union[str, list]) – Whether screenshots are saved, and whether they go to history or to disk

  • html (Union[str, list]) – Whether HTML is saved, and whether it goes to history or to disk

  • compress_html (boolean) – Whether HTML should be compressed or not before saving to the history

  • cache_dir (str) – Where to save on-disk screenshots and HTML files

  • data_dir (str) – Root directory where persona data (Chrome profiles) are stored

  • headless (boolean) – Whether to start the browser in headless mode

  • driver – WebDriver to use instead of starting a new one

  • resume (boolean) – Whether to pick up where the previous run left off.

  • ublock (boolean) – Whether to automatically install uBlock Origin
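
Because every setting is a keyword argument, a misspelled name is easy to introduce. As a minimal sketch, the helper below merges overrides into the defaults copied from the class signature above and rejects any name that is not listed. The names DOCUMENTED_DEFAULTS and engine_settings are hypothetical and not part of Persine.

```python
# Defaults copied from the documented PersonaEngine signature.
DOCUMENTED_DEFAULTS = {
    "height": 1200,
    "width": 1600,
    "screenshot_scale": 0.5,
    "screenshot": None,
    "html": None,
    "compress_html": True,
    "cache_dir": None,
    "data_dir": None,
    "headless": False,
    "driver": None,
    "resume": False,
    "ublock": False,
}


def engine_settings(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = sorted(set(overrides) - set(DOCUMENTED_DEFAULTS))
    if unknown:
        raise TypeError(f"Unknown PersonaEngine setting(s): {', '.join(unknown)}")
    return {**DOCUMENTED_DEFAULTS, **overrides}


settings = engine_settings(headless=True, width=1280)
```

With Persine installed, the resulting dictionary could presumably be passed along as PersonaEngine(**settings).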

get_driver_options(user_data_dir=None)

Create the options necessary to start the appropriate webdriver.Chrome instance

Returns

webdriver.ChromeOptions

get_state(driver, url)

Get the current state of the page.

Returns

A representation of the current page (key, action, url, etc)

Return type

dict

launch(user_data_dir=None)

Launches a Chrome instance.

Returns

webdriver.Chrome

persona(name=None, resume=False)

Initializes a persona with the given name.

Returns

The persona initialized by the engine.

Return type

Persona

run(driver, url)

Runs a command through the appropriate bridge.

Returns

A single state representation. Will return a list of state representations if it’s a multi-step command. For example, youtube:next_up#30 to hit ‘next up’ 30 times

Return type

Union[dict, list(dict)]

take_screenshot(driver)

Take a screenshot of the current window.

Returns

The resized screenshot

Return type

Image
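
To get a feel for what screenshot_scale does to the saved image, the arithmetic can be sketched as follows. This assumes the factor is applied equally to both dimensions and that the capture matches the configured window size; neither detail is confirmed by this page, and scaled_size is a hypothetical helper.

```python
def scaled_size(width, height, scale):
    """Return (width, height) after applying one scaling factor to both."""
    return (round(width * scale), round(height * scale))


# Default window size and default screenshot_scale from the signature above.
print(scaled_size(1600, 1200, 0.5))  # prints (800, 600)
```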

Persona

class persine.Persona(engine, name=None, history_path=None, user_data_dir=None, resume=False, overwrite=False)

Bases: object

The Persona represents a single user. If it is given a name, it is associated with an individual Chrome profile.

Parameters
  • engine (PersonaEngine) – The engine to associate with this persona

  • name (str) – The name to be given to this profile. If not named, an empty profile is used.

  • history_path (str) – Path to the JSON file that holds this persona’s action/browsing history

  • user_data_dir (str) – If specified, load the Chrome profile from this folder

  • resume (boolean) – Whether this persona should resume a previous persona with the same name. If False, the previous Chrome profile is deleted.

  • overwrite (boolean) – Whether to prompt the user when overwriting a previous persona’s Chrome profile (see resume)

clear()

Deletes all previous data for that Chrome profile, including history file and user_data_dir

launch()

Launches a browser through PersonaEngine

load_history()

Loads the browsing/command history from a file

quit()

Quits the browser

run(url, notes=None)

Runs a single command and updates the history.

Parameters
  • url (str) – The action to run or URL to visit

  • notes (dict) – Additional information to include in the history row

Returns

A single state representation. Will return a list of state representations if it’s a multi-step command. For example, youtube:next_up#30 to hit ‘next up’ 30 times

Return type

Union[dict, list(dict)]
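
Since the return value may be either one dict or a list of dicts, calling code often wants a single shape to loop over. A minimal sketch of such a normalizer follows; as_state_list is a hypothetical helper, not part of Persine.

```python
def as_state_list(result):
    """Wrap a single state dict in a list; copy a list of states as-is."""
    if isinstance(result, dict):
        return [result]
    return list(result)


single = {"url": "https://example.com/"}
many = [{"url": "a"}, {"url": "b"}]
print(len(as_state_list(single)), len(as_state_list(many)))  # prints 1 2
```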

run_batch(urls)

Run a series of commands

save_history()

Saves the browsing/command history to a file

update_history(state, notes=None)

Updates history/recommendations lists with the given state
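
Put together, the methods above suggest a simple session lifecycle. The sketch below uses only the documented calls persona(), launch(), run() and quit(), and accepts any engine-like object so it can be tried without a browser. It assumes launch() has to be called explicitly before run(), which this page does not state; run_session is a hypothetical helper.

```python
def run_session(engine, commands, name=None):
    """Launch a persona, run each command in order, and always quit the browser."""
    persona = engine.persona(name=name)
    persona.launch()  # assumption: the browser must be started explicitly
    try:
        states = []
        for command in commands:
            result = persona.run(command)
            # A multi-step command yields a list of states; flatten it.
            states.extend(result if isinstance(result, list) else [result])
        return states
    finally:
        # Quit even if a command raised, so no browser is left running.
        persona.quit()
```

With Persine installed, a call might look like run_session(PersonaEngine(headless=True), ["youtube:search?kittens", "youtube:next_up#30"], name="demo").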

Bridges

class persine.bridges.AmazonBridge(driver)

Bases: persine.bridges.bridge.BaseBridge

A bridge that interacts with and scrapes Amazon

get_data()

Return important data from the page, as well as a list of the recommendations

Returns

Representation of the page

Return type

dict

run(url)

Run an action/visit a URL.

Returns

Representation of the page

Return type

dict

class persine.bridges.BaseBridge(driver)

Bases: object

A completely useless Bridge that at least shows you what they’re supposed to implement

Parameters

driver – A Selenium WebDriver used to navigate

get_data()

Return important data from the page, as well as a list of the recommendations

Returns

Representation of the page

Return type

dict

run(url)

Run an action/visit a URL.

Returns

Representation of the page

Return type

dict
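
To illustrate the two methods a bridge is expected to implement, here is a sketch of a custom bridge. It defines a local stand-in for BaseBridge so it runs without Persine or a browser; the stand-in's behavior, the TitleBridge and StubDriver classes, and the dictionary keys are all assumptions made for illustration. Only get(), current_url and title are real Selenium WebDriver members.

```python
class BaseBridge:
    """Local stand-in for persine.bridges.BaseBridge so the sketch runs alone."""

    def __init__(self, driver):
        self.driver = driver

    def get_data(self):
        raise NotImplementedError

    def run(self, url):
        raise NotImplementedError


class TitleBridge(BaseBridge):
    """Hypothetical bridge that records the page title and no recommendations."""

    def get_data(self):
        # The keys below are illustrative, not Persine's actual schema.
        return {
            "url": self.driver.current_url,
            "title": self.driver.title,
            "recommendations": [],
        }

    def run(self, url):
        self.driver.get(url)  # Selenium's WebDriver.get loads the page
        return self.get_data()


class StubDriver:
    """Minimal fake of the WebDriver members used above, for offline testing."""

    current_url = ""
    title = ""

    def get(self, url):
        self.current_url = url
        self.title = "stub page"


page = TitleBridge(StubDriver()).run("https://example.com/")
print(page["title"])  # prints stub page
```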

class persine.bridges.YoutubeBridge(driver)

Bases: persine.bridges.bridge.BaseBridge

A bridge that interacts with and scrapes YouTube

get_data()

Return important data from the page, as well as a list of the recommendations

Returns

Representation of the page

Return type

dict

run(url)

Run an action/visit a URL.

Returns

Representation of the page

Return type

dict