TikTok pages are failing to load with Selenium

I'm working on a TikTok crawler that uses both Selenium and Scrapy:

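from fake_useragent import UserAgent
from selenium import webdriver

start_urls = ['https://www.tiktok.com/trending']
....
def parse(self, response):
    options = webdriver.ChromeOptions()
    # Pick a random User-Agent string for this browser session
    user_agent = UserAgent().random
    options.add_argument(f'user-agent={user_agent}')
    # Chrome expects width and height separated by a comma
    options.add_argument('window-size=800,841')
    # options= replaces the deprecated chrome_options= keyword
    driver = webdriver.Chrome(options=options)
    driver.get(response.url)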

The crawler opens Chrome successfully, but the videos never load.

The problem persists in Firefox: the videos do not load there either.

The same thing happens with a basic Selenium script, without Scrapy:

from selenium import webdriver
import time


driver = webdriver.Firefox()
driver.get("https://www.tiktok.com/trending")
time.sleep(10)
driver.close()

driver = webdriver.Chrome()
driver.get("https://www.tiktok.com/trending")
time.sleep(10)
driver.close()

Answer №1

Have you looked more closely at what the Selenium browser window actually shows? If you are getting an Error 404 on certain pages, here is a fix that worked for me:

I changed my User-Agent to "Naverbot", which is permitted by TikTok's robots.txt file.

After making this change, all pages and videos loaded correctly.

Other user-agents listed under the "allow" section should also work if you want to rotate them.
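To find out which user-agents are worth rotating, you could check candidate names against the robots.txt rules with Python's standard `urllib.robotparser`. This is only a rough sketch: the robots.txt text below is a made-up sample (not TikTok's real file), and `allowed_agents` and "SomeOtherBot" are hypothetical names used for illustration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (an assumption, NOT TikTok's actual file).
SAMPLE_ROBOTS_TXT = """\
User-agent: Naverbot
Allow: /

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, candidates, url):
    """Return the candidate user-agents that robots_txt permits to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines
    parser.parse(robots_txt.splitlines())
    return [agent for agent in candidates if parser.can_fetch(agent, url)]


# Hypothetical candidate names to test against the sample rules
candidates = ["Naverbot", "SomeOtherBot"]
usable = allowed_agents(
    SAMPLE_ROBOTS_TXT, candidates, "https://www.tiktok.com/trending"
)
print(usable)
```

Any name that comes back in `usable` can then be passed to the browser the same way as in the question, with `options.add_argument(f'user-agent={agent}')`. To check the live file instead of a pasted sample, `RobotFileParser` also offers `set_url()` and `read()`.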

Answer №2

If you want an alternative to Chrome or Firefox, consider Internet Explorer. It may render the video layout differently from the other browsers.

Here is one possible reason why your page isn't loading:

Some advanced web applications inspect your browser history, profile data, and cache to verify that a real user is present. To address this, try running Selenium with your default browser profile; it may help.
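A minimal sketch of how the profile idea might look with Chrome, assuming Selenium 4 and a profile directory you supply yourself. The helper names `build_chrome_args` and `launch_chrome` and the example path are hypothetical; `--user-data-dir`, `--window-size` and `--user-agent` are regular Chrome command-line switches.

```python
def build_chrome_args(profile_dir, user_agent=None, window_size=(800, 841)):
    """Build Chrome command-line switches for a session that reuses a profile."""
    width, height = window_size
    args = [
        f"--user-data-dir={profile_dir}",     # reuse cookies, history and cache
        f"--window-size={width},{height}",    # comma-separated, not "800x841"
    ]
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    return args


def launch_chrome(args):
    """Start Chrome with the given switches (requires selenium and chromedriver)."""
    from selenium import webdriver  # imported lazily so the helper above works without it

    options = webdriver.ChromeOptions()
    for arg in args:
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


# Hypothetical profile path; replace it with the location of your own profile.
chrome_args = build_chrome_args("/path/to/chrome/profile", user_agent="Naverbot")
print(chrome_args)
```

Calling `launch_chrome(chrome_args)` then opens the browser. Chrome generally refuses to share a profile directory with an instance that is already running, so close your normal browser window first or point the switch at a copy of the profile.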
