Integrity
Write
Loading...
Ben "The Hosk" Hosking

Ben "The Hosk" Hosking

3 years ago

The Yellow Cat Test Is Typically Failed by Software Developers.

More on Technology

Frank Andrade

Frank Andrade

2 years ago

I discovered a bug that allowed me to use ChatGPT to successfully web scrape. Here's how it operates.

This method scrapes websites with ChatGPT (demo with Amazon and Twitter)

Photo by Mikhail Nilov on Pexels

In a recent article, I demonstrated how to scrape websites using ChatGPT prompts like scrape website X using Python.

But that doesn’t always work.

After scraping dozens of websites with ChatGPT, I realized that simple prompts rarely work for web scraping.

Using ChatGPT and basic HTML, we can scrape any website.

First things first, use ChatGPT's advanced version (Playground)

ChatGPT Playground's enhanced version is needed to scrape websites quickly. It generates code faster and has less constraints.

Here’s how it looks.

The classic ChatGPT display just allows you to type a prompt. Playground is more customizable and generates code faster than the basic version.

No more delays or restrictions.

In this tutorial, we'll write prompts underneath Playground.

How to Use ChatGPT to Scrape Any Website

To show you how to utilize ChatGPT to scrape any website, we'll use subslikescript, which lists movies.

Later in this post, I'll show you how to scrape Amazon and Twitter, but let's start easy.

If we try a basic prompt like “scrape the movie titles on this website: https://subslikescript.com/movies” it won’t scrape anything. Here’s when we need some HTML knowledge to build the right prompt for ChatGPT.Let's scrape the above movie titles. First, check the webpage. Right-click any element and choose Inspect.

Then see below.

We identify the element to scrape (those in the red box). The data elements are ul, a, and scripts-list.

Here's the prompt for movie titles.

Scrape this website: https://subslikescript.com/movies with Python and Beautiful Soup. Locate the element with tag “ul” and class “scripts-list”. Scrape all the “a” elements inside. Get the text attribute and print it.

Code generated:

import requests
from bs4 import BeautifulSoup

page = requests.get('https://subslikescript.com/movies/')

soup = BeautifulSoup(page.content, 'html.parser')

scripts_list = soup.find(class_="scripts-list")

all_a_elements = scripts_list.find_all('a')

for element in all_a_elements:
    print(element.get_text())

It extracts movie titles successfully.

Let's scrape Amazon and Twitter.

ChatGPT's Amazon scraping

Consider scraping Amazon for self-help books. First, copy the Amazon link for self-help books.

Here’s the link I got. Location-dependent connection. Use my link to replicate my results.

Now we'll check book titles. Here's our element.

If we want to extract the book titles, we need to use the tag name spanclass attribute name and a-size-base-plus a-color-base a-text-normalattribute value.

This time I'll use Selenium. I'll add Selenium-specific commands like wait 5 seconds and generate an XPath.

Scrape this website https://www.amazon.com/s?k=self+help+books&sprefix=self+help+%2Caps%2C158&ref=nb_sb_ss_ts-doa-p_2_10 with Python and Selenium.

Wait 5 seconds and locate all the elements with the following xpath: “span” tag, “class” attribute name, and “a-size-base-plus a-color-base a-text-normal” attribute value. Get the text attribute and print them.

Code generated: (I only had to manually add the path where my chromedriver is located).

from selenium import webdriver
from selenium.webdriver.common.by import By
from time import sleep

#initialize webdriver
driver = webdriver.Chrome('<add path of your chromedriver>')

#navigate to the website
driver.get("https://www.amazon.com/s?k=self+help+books&sprefix=self+help+%2Caps%2C158&ref=nb_sb_ss_ts-doa-p_2_10")

#wait 5 seconds to let the page load
sleep(5)

#locate all the elements with the following xpath
elements = driver.find_elements(By.XPATH, '//span[@class="a-size-base-plus a-color-base a-text-normal"]')

#get the text attribute of each element and print it
for element in elements:
    print(element.text)

#close the webdriver
driver.close()

It pulls Amazon book titles.

Utilizing ChatGPT to scrape Twitter

Say you wish to scrape ChatGPT tweets. Search Twitter for ChatGPT and copy the URL.

Here’s the link I got. We must check every tweet. Here's our element.

To extract a tweet, use the div tag and lang attribute.

Again, Selenium.

Scrape this website: https://twitter.com/search?q=chatgpt&src=typed_query using Python, Selenium and chromedriver.

Maximize the window, wait 15 seconds and locate all the elements that have the following XPath: “div” tag, attribute name “lang”. Print the text inside these elements.

Code generated: (again, I had to add the path where my chromedriver is located)

from selenium import webdriver
import time

driver = webdriver.Chrome("/Users/frankandrade/Downloads/chromedriver")
driver.maximize_window()
driver.get("https://twitter.com/search?q=chatgpt&src=typed_query")
time.sleep(15)

elements = driver.find_elements_by_xpath("//div[@lang]")
for element in elements:
    print(element.text)

driver.quit()

You'll get the first 2 or 3 tweets from a search. To scrape additional tweets, click X times.

Congratulations! You scraped websites without coding by using ChatGPT.

Shalitha Suranga

Shalitha Suranga

3 years ago

The Top 5 Mathematical Concepts Every Programmer Needs to Know

Using math to write efficient code in any language

Photo by Emile Perron on Unsplash, edited with Canva

Programmers design, build, test, and maintain software. Employ cases and personal preferences determine the programming languages we use throughout development. Mobile app developers use JavaScript or Dart. Some programmers design performance-first software in C/C++.

A generic source code includes language-specific grammar, pre-implemented function calls, mathematical operators, and control statements. Some mathematical principles assist us enhance our programming and problem-solving skills.

We all use basic mathematical concepts like formulas and relational operators (aka comparison operators) in programming in our daily lives. Beyond these mathematical syntaxes, we'll see discrete math topics. This narrative explains key math topics programmers must know. Master these ideas to produce clean and efficient software code.

Expressions in mathematics and built-in mathematical functions

A source code can only contain a mathematical algorithm or prebuilt API functions. We develop source code between these two ends. If you create code to fetch JSON data from a RESTful service, you'll invoke an HTTP client and won't conduct any math. If you write a function to compute the circle's area, you conduct the math there.

When your source code gets more mathematical, you'll need to use mathematical functions. Every programming language has a math module and syntactical operators. Good programmers always consider code readability, so we should learn to write readable mathematical expressions.

Linux utilizes clear math expressions.

A mathematical expression/formula in the Linux codebase, a screenshot by the author

Inbuilt max and min functions can minimize verbose if statements.

Reducing a verbose nested-if with the min function in Neutralinojs, a screenshot by the author

How can we compute the number of pages needed to display known data? In such instances, the ceil function is often utilized.

import math as m
results = 102
items_per_page = 10 
pages = m.ceil(results / items_per_page)
print(pages)

Learn to write clear, concise math expressions.

Combinatorics in Algorithm Design

Combinatorics theory counts, selects, and arranges numbers or objects. First, consider these programming-related questions. Four-digit PIN security? what options exist? What if the PIN has a prefix? How to locate all decimal number pairs?

Combinatorics questions. Software engineering jobs often require counting items. Combinatorics counts elements without counting them one by one or through other verbose approaches, therefore it enables us to offer minimum and efficient solutions to real-world situations. Combinatorics helps us make reliable decision tests without missing edge cases. Write a program to see if three inputs form a triangle. This is a question I commonly ask in software engineering interviews.

Graph theory is a subfield of combinatorics. Graph theory is used in computerized road maps and social media apps.

Logarithms and Geometry Understanding

Geometry studies shapes, angles, and sizes. Cartesian geometry involves representing geometric objects in multidimensional planes. Geometry is useful for programming. Cartesian geometry is useful for vector graphics, game development, and low-level computer graphics. We can simply work with 2D and 3D arrays as plane axes.

GetWindowRect is a Windows GUI SDK geometric object.

GetWindowRect outputs an LPRECT geometric object, a screenshot by the author

High-level GUI SDKs and libraries use geometric notions like coordinates, dimensions, and forms, therefore knowing geometry speeds up work with computer graphics APIs.

How does exponentiation's inverse function work? Logarithm is exponentiation's inverse function. Logarithm helps programmers find efficient algorithms and solve calculations. Writing efficient code involves finding algorithms with logarithmic temporal complexity. Programmers prefer binary search (O(log n)) over linear search (O(n)). Git source specifies O(log n):

The Git codebase defines a function with logarithmic time complexity, a screenshot by the author

Logarithms aid with programming math. Metas Watchman uses a logarithmic utility function to find the next power of two.

A utility function that uses ceil, a screenshot by the author

Employing Mathematical Data Structures

Programmers must know data structures to develop clean, efficient code. Stack, queue, and hashmap are computer science basics. Sets and graphs are discrete arithmetic data structures. Most computer languages include a set structure to hold distinct data entries. In most computer languages, graphs can be represented using neighboring lists or objects.

Using sets as deduped lists is powerful because set implementations allow iterators. Instead of a list (or array), store WebSocket connections in a set.

Most interviewers ask graph theory questions, yet current software engineers don't practice algorithms. Graph theory challenges become obligatory in IT firm interviews.

Recognizing Applications of Recursion

A function in programming isolates input(s) and output(s) (s). Programming functions may have originated from mathematical function theories. Programming and math functions are different but similar. Both function types accept input and return value.

Recursion involves calling the same function inside another function. In its implementation, you'll call the Fibonacci sequence. Recursion solves divide-and-conquer software engineering difficulties and avoids code repetition. I recently built the following recursive Dart code to render a Flutter multi-depth expanding list UI:

Recursion is not the natural linear way to solve problems, hence thinking recursively is difficult. Everything becomes clear when a mathematical function definition includes a base case and recursive call.

Conclusion

Every codebase uses arithmetic operators, relational operators, and expressions. To build mathematical expressions, we typically employ log, ceil, floor, min, max, etc. Combinatorics, geometry, data structures, and recursion help implement algorithms. Unless you operate in a pure mathematical domain, you may not use calculus, limits, and other complex math in daily programming (i.e., a game engine). These principles are fundamental for daily programming activities.

Master the above math fundamentals to build clean, efficient code.

James Brockbank

3 years ago

Canonical URLs for Beginners

Canonicalization and canonical URLs are essential for SEO, and improper implementation can negatively impact your site's performance.

Canonical tags were introduced in 2009 to help webmasters with duplicate or similar content on multiple URLs.

To use canonical tags properly, you must understand their purpose, operation, and implementation.

Canonical URLs and Tags

Canonical tags tell search engines that a certain URL is a page's master copy. They specify a page's canonical URL. Webmasters can avoid duplicate content by linking to the "canonical" or "preferred" version of a page.

How are canonical tags and URLs different? Can these be specified differently?

Tags

Canonical tags are found in an HTML page's head></head> section.

<link rel="canonical" href="https://www.website.com/page/" />

These can be self-referencing or reference another page's URL to consolidate signals.

Canonical tags and URLs are often used interchangeably, which is incorrect.

The rel="canonical" tag is the most common way to set canonical URLs, but it's not the only way.

Canonical URLs

What's a canonical link? Canonical link is the'master' URL for duplicate pages.

In Google's own words:

A canonical URL is the page Google thinks is most representative of duplicate pages on your site.

— Google Search Console Help

You can indicate your preferred canonical URL. For various reasons, Google may choose a different page than you.

When set correctly, the canonical URL is usually your specified URL.

Canonical URLs determine which page will be shown in search results (unless a duplicate is explicitly better for a user, like a mobile version).

Canonical URLs can be on different domains.

Other ways to specify canonical URLs

Canonical tags are the most common way to specify a canonical URL.

You can also set canonicals by:

  • Setting the HTTP header rel=canonical.

  • All pages listed in a sitemap are suggested as canonicals, but Google decides which pages are duplicates.

  • Redirects 301.

Google recommends these methods, but they aren't all appropriate for every situation, as we'll see below. Each has its own recommended uses.

Setting canonical URLs isn't required; if you don't, Google will use other signals to determine the best page version.

To control how your site appears in search engines and to avoid duplicate content issues, you should use canonicalization effectively.

Why Duplicate Content Exists

Before we discuss why you should use canonical URLs and how to specify them in popular CMSs, we must first explain why duplicate content exists. Nobody intentionally duplicates website content.

Content management systems create multiple URLs when you launch a page, have indexable versions of your site, or use dynamic URLs.

Assume the following URLs display the same content to a user:

  1. https://www.website.com/category/product-a/

  2. https://www.website.com/product-a/

  3. https://website.com/product-a/

  4. http://www.website.com/product-a/

  5. http://website.com/product-a/

  6. https://m.website.com/product-a/

  7. https://www.website.com/product-a

  8. https://www.website.com/product-A/

A search engine sees eight duplicate pages, not one.

  • URLs #1 and #2: the CMS saves product URLs with and without the category name.

  • #3, #4, and #5 result from the site being accessible via HTTP, HTTPS, www, and non-www.

  • #6 is a subdomain mobile-friendly URL.

  • URL #7 lacks URL #2's trailing slash.

  • URL #8 uses a capital "A" instead of a lowercase one.

Duplicate content may also exist in URLs like:

https://www.website.com
https://www.website.com/index.php

Duplicate content is easy to create.

Canonical URLs help search engines identify different page variations as a single URL on many sites.

SEO Canonical URLs

Canonical URLs help you manage duplicate content that could affect site performance.

Canonical URLs are a technical SEO focus area for many reasons.

Specify URL for search results

When you set a canonical URL, you tell Google which page version to display.

Which would you click?

https://www.domain.com/page-1/

https://www.domain.com/index.php?id=2

First, probably.

Canonicals tell search engines which URL to rank.

Consolidate link signals on similar pages

When you have duplicate or nearly identical pages on your site, the URLs may get external links.

Canonical URLs consolidate multiple pages' link signals into a single URL.

This helps your site rank because signals from multiple URLs are consolidated into one.

Syndication management

Content is often syndicated to reach new audiences.

Canonical URLs consolidate ranking signals to prevent duplicate pages from ranking and ensure the original content ranks.

Avoid Googlebot duplicate page crawling

Canonical URLs ensure that Googlebot crawls your new pages rather than duplicated versions of the same one across mobile and desktop versions, for example.

Crawl budgets aren't an issue for most sites unless they have 100,000+ pages.

How to Correctly Implement the rel=canonical Tag

Using the header tag rel="canonical" is the most common way to specify canonical URLs.

Adding tags and HTML code may seem daunting if you're not a developer, but most CMS platforms allow canonicals out-of-the-box.

These URLs each have one product.

How to Correctly Implement a rel="canonical" HTTP Header

A rel="canonical" HTTP header can replace canonical tags.

This is how to implement a canonical URL for PDFs or non-HTML documents.

You can specify a canonical URL in your site's.htaccess file using the code below.

<Files "file-to-canonicalize.pdf"> Header add Link "< http://www.website.com/canonical-page/>; rel=\"canonical\"" </Files>

301 redirects for canonical URLs

Google says 301 redirects can specify canonical URLs.

Only the canonical URL will exist if you use 301 redirects. This will redirect duplicates.

This is the best way to fix duplicate content across:

  • HTTPS and HTTP

  • Non-WWW and WWW

  • Trailing-Slash and Non-Trailing Slash URLs

On a single page, you should use canonical tags unless you can confidently delete and redirect the page.

Sitemaps' canonical URLs

Google assumes sitemap URLs are canonical, so don't include non-canonical URLs.

This does not guarantee canonical URLs, but is a best practice for sitemaps.

Best-practice Canonical Tag

Once you understand a few simple best practices for canonical tags, spotting and cleaning up duplicate content becomes much easier.

Always include:

One canonical URL per page

If you specify multiple canonical URLs per page, they will likely be ignored.

Correct Domain Protocol

If your site uses HTTPS, use this as the canonical URL. It's easy to reference the wrong protocol, so check for it to catch it early.

Trailing slash or non-trailing slash URLs

Be sure to include trailing slashes in your canonical URL if your site uses them.

Specify URLs other than WWW

Search engines see non-WWW and WWW URLs as duplicate pages, so use the correct one.

Absolute URLs

To ensure proper interpretation, canonical tags should use absolute URLs.

So use:

<link rel="canonical" href="https://www.website.com/page-a/" />

And not:

<link rel="canonical" href="/page-a/" />

If not canonicalizing, use self-referential canonical URLs.

When a page isn't canonicalizing to another URL, use self-referencing canonical URLs.

Canonical tags refer to themselves here.

Common Canonical Tags Mistakes

Here are some common canonical tag mistakes.

301 Canonicalization

Set the canonical URL as the redirect target, not a redirected URL.

Incorrect Domain Canonicalization

If your site uses HTTPS, don't set canonical URLs to HTTP.

Irrelevant Canonicalization

Canonicalize URLs to duplicate or near-identical content only.

SEOs sometimes try to pass link signals via canonical tags from unrelated content to increase rank. This isn't how canonicalization should be used and should be avoided.

Multiple Canonical URLs

Only use one canonical tag or URL per page; otherwise, they may all be ignored.

When overriding defaults in some CMSs, you may accidentally include two canonical tags in your page's <head>.

Pagination vs. Canonicalization

Incorrect pagination can cause duplicate content. Canonicalizing URLs to the first page isn't always the best solution.

Canonicalize to a 'view all' page.

How to Audit Canonical Tags (and Fix Issues)

Audit your site's canonical tags to find canonicalization issues.

SEMrush Site Audit can help. You'll find canonical tag checks in your website's site audit report.

Let's examine these issues and their solutions.

No Canonical Tag on AMP

Site Audit will flag AMP pages without canonical tags.

Canonicalization between AMP and non-AMP pages is important.

Add a rel="canonical" tag to each AMP page's head>.

No HTTPS redirect or canonical from HTTP homepage

Duplicate content issues will be flagged in the Site Audit if your site is accessible via HTTPS and HTTP.

You can fix this by 301 redirecting or adding a canonical tag to HTTP pages that references HTTPS.

Broken canonical links

Broken canonical links won't be considered canonical URLs.

This error could mean your canonical links point to non-existent pages, complicating crawling and indexing.

Update broken canonical links to the correct URLs.

Multiple canonical URLs

This error occurs when a page has multiple canonical URLs.

Remove duplicate tags and leave one.

Canonicalization is a key SEO concept, and using it incorrectly can hurt your site's performance.

Once you understand how it works, what it does, and how to find and fix issues, you can use it effectively to remove duplicate content from your site.


Canonicalization SEO Myths

You might also like

Jim Siwek

Jim Siwek

3 years ago

In 2022, can a lone developer be able to successfully establish a SaaS product?

Photo by Austin Distel on Unsplash

In the early 2000s, I began developing SaaS. I helped launch an internet fax service that delivered faxes to email inboxes. Back then, it saved consumers money and made the procedure easier.

Google AdWords was young then. Anyone might establish a new website, spend a few hundred dollars on keywords, and see dozens of new paying clients every day. That's how we launched our new SaaS, and these clients stayed for years. Our early ROI was sky-high.

Changing times

The situation changed dramatically after 15 years. Our paid advertising cost $200-$300 for every new customer. Paid advertising takes three to four years to repay.

Fortunately, we still had tens of thousands of loyal clients. Good organic rankings gave us new business. We needed less sponsored traffic to run a profitable SaaS firm.

Is it still possible?

Since selling our internet fax firm, I've dreamed about starting a SaaS company. One I could construct as a lone developer and progressively grow a dedicated customer base, as I did before in a small team.

It seemed impossible to me. Solo startups couldn't afford paid advertising. SEO was tough. Even the worst SaaS startup ideas attracted VC funding. How could I compete with startups that could hire great talent and didn't need to make money for years (or ever)?

The One and Only Way to Learn

After years of talking myself out of SaaS startup ideas, I decided to develop and launch one. I needed to know if a solitary developer may create a SaaS app in 2022.

Thus, I did. I invented webwriter.ai, an AI-powered writing tool for website content, from hero section headlines to blog posts, this year. I soft-launched an MVP in July.

Considering the Issue

Now that I've developed my own fully capable SaaS app for site builders and developers, I wonder if it's still possible. Can webwriter.ai be successful?

I know webwriter.ai's proposal is viable because Jasper.ai and Grammarly are also AI-powered writing tools. With competition comes validation.

To Win, Differentiate

To compete with well-funded established brands, distinguish to stand out to a portion of the market. So I can speak directly to a target user, unlike larger competition.

I created webwriter.ai to help web builders and designers produce web content rapidly. This may be enough differentiation for now.

Budget-Friendly Promotion

When paid search isn't an option, we get inventive. There are more tools than ever to promote a new website.

  • Organic Results

  • on social media (Twitter, Instagram, TikTok, LinkedIn)

  • Marketing with content that is compelling

  • Link Creation

  • Listings in directories

  • references made in blog articles and on other websites

  • Forum entries

The Beginning of the Journey

As I've labored to construct my software, I've pondered a new mantra. Not sure where that originated from, but I like it. I'll live by it and teach my kids:

“Do the work.”

Cory Doctorow

Cory Doctorow

3 years ago

The downfall of the Big Four accounting companies is just one (more) controversy away.

Economic mutual destruction.

Multibillion-dollar corporations never bothered with an independent audit, and they all lied about their balance sheets.

It's easy to forget that the Big Four accounting firms are lousy fraud enablers. Just because they sign off on your books doesn't mean you're not a hoax waiting to erupt.

This is *crazy* Capitalism depends on independent auditors. Rich folks need to know their financial advisers aren't lying. Rich folks usually succeed.

No accounting. EY, KPMG, PWC, and Deloitte make more money consulting firms than signing off on their accounts.

The Big Four sign off on phony books because failing to make friends with unscrupulous corporations may cost them consulting contracts.

The Big Four are the only firms big enough to oversee bankruptcy when they sign off on fraudulent books, as they did for Carillion in 2018. All four profited from Carillion's bankruptcy.

The Big Four are corrupt without any consequences for misconduct. Who can forget when KPMG's top management was fined millions for helping auditors cheat on ethics exams?

Consulting and auditing conflict. Consultants help a firm cover its evil activities, such as tax fraud or wage theft, whereas auditors add clarity to a company's finances. The Big Four make more money from cooking books than from uncooking them, thus they are constantly embroiled in scandals.

If a major scandal breaks, it may bring down the entire sector and substantial parts of the economy. Jim Peterson explains system risk for The Dig.

The Big Four are voluntary private partnerships where accountants invest their time, reputations, and money. If a controversy threatens the business, partners who depart may avoid scandal and financial disaster.

When disaster looms, each partner should bolt for the door, even if a disciplined stay-and-hold posture could weather the storm. This happened to Arthur Andersen during Enron's collapse, and a 2006 EU report recognized the risk to other corporations.

Each partner at a huge firm knows how much dirty laundry they've buried in the company's garden, and they have well-founded suspicions about what other partners have buried, too. When someone digs, everyone runs.

If a firm confronts substantial litigation damages or enforcement penalties, it could trigger the collapse of one of the Big Four. That would be bad news for the firm's clients, who would have trouble finding another big auditor.

Most of the world's auditing capacity is concentrated in four enormous, brittle, opaque, compromised organizations. If one of them goes bankrupt, the other three won't be able to take on its clients.

Peterson: Another collapse would strand many of the world's large public businesses, leaving them unable to obtain audit views for their securities listings and regulatory compliance.

Count Down: The Past, Present, and Uncertain Future of the Big Four Accounting Firms is in its second edition.

https://www.emerald.com/insight/publication/doi/10.1108/9781787147003

Scott Galloway

Scott Galloway

3 years ago

Attentive

From oil to attention.

Oil has been the most important commodity for a century. It's sparked wars. Pearl Harbor was a preemptive strike to guarantee Japanese access to Indonesian oil, and it made desert tribes rich. Oil's heyday is over. From oil to attention.

We talked about an information economy. In an age of abundant information, what's scarce? Attention. Scale of the world's largest enterprises, wealth of its richest people, and power of governments all stem from attention extraction, monetization, and custody.

Attention-grabbing isn't new. Humans have competed for attention and turned content into wealth since Aeschylus' Oresteia. The internal combustion engine, industrial revolutions in mechanization and plastics, and the emergence of a mobile Western lifestyle boosted oil. Digitization has put wells in pockets, on automobile dashboards, and on kitchen counters, drilling for attention.

The most valuable firms are attention-seeking enterprises, not oil companies. Big Tech dominates the top 4. Tech and media firms are the sheikhs and wildcatters who capture our attention. Blood will flow as the oil economy rises.

Attention to Detail

More than IT and media companies compete for attention. Podcasting is a high-growth, low-barrier-to-entry chance for newbies to gain attention and (for around 1%) make money. Conferences are good for capturing in-person attention. Salesforce paid $30 billion for Slack's dominance of workplace attention, while Spotify is transforming music listening attention into a media platform.

Conferences, newsletters, and even music streaming are artisan projects. Even 130,000-person Comic Con barely registers on the attention economy's Richter scale. Big players have hundreds of millions of monthly users.

Supermajors

Even titans can be disrupted in the attention economy. TikTok is fracking king Chesapeake Energy, a rule-breaking insurgent with revolutionary extraction technologies. Attention must be extracted, processed, and monetized. Innovators disrupt the attention economy value chain.

Attention pre-digital Entrepreneurs commercialized intriguing or amusing stuff like a newspaper or TV show through subscriptions and ads. Digital storage and distribution's limitless capacity drove the initial wave of innovation. Netflix became dominant by releasing old sitcoms and movies. More ad-free content gained attention. By 2016, Netflix was greater than cable TV. Linear scale, few network effects.

Social media introduced two breakthroughs. First, users produced and paid for content. Netflix's economics are dwarfed by TikTok and YouTube, where customers create the content drill rigs that the platforms monetize.

Next, social media businesses expanded content possibilities. Twitter, Facebook, and Reddit offer traditional content, but they transform user comments into more valuable (addictive) emotional content. By emotional resonance, I mean they satisfy a craving for acceptance or anger us. Attention and emotion are mined from comments/replies, piss-fights, and fast-brigaded craziness. Exxon has turned exhaust into heroin. Should we be so linked without a commensurate presence? You wouldn't say this in person. Anonymity allows fraudulent accounts and undesirable actors, which platforms accept to profit from more pollution.

FrackTok

A new entrepreneur emerged as ad-driven social media anger contaminated the water table. TikTok is remaking the attention economy. Short-form video platform relies on user-generated content, although delivery is narrower and less social.

Netflix grew on endless options. Choice requires cognitive effort. TikTok is the least demanding platform since TV. App video plays when opened. Every video can be skipped with a swipe. An algorithm watches how long you watch, what you finish, and whether you like or follow to create a unique streaming network. You can follow creators and respond, but the app is passive. TikTok's attention economy recombination makes it apex predator. The app has more users than Facebook and Instagram combined. Among teens, it's overtaking the passive king, TV.

Externalities

Now we understand fossil fuel externalities. A carbon-based economy has harmed the world. Fracking brought large riches and rebalanced the oil economy, but at a cost: flammable water, earthquakes, and chemical leaks.

TikTok has various concerns associated with algorithmically generated content and platforms. A Wall Street Journal analysis discovered new accounts listed as belonging to 13- to 15-year-olds would swerve into rabbitholes of sex- and drug-related films in mere days. TikTok has a unique externality: Chinese Communist Party ties. Our last two presidents realized the relationship's perils. Concerned about platform's propaganda potential.

No evidence suggests the CCP manipulated information to harm American interests. A headjack implanted on America's youth, who spend more time on TikTok than any other network, connects them to a neural network that may be modified by the CCP. If the product and ownership can't be separated, the app should be banned. Putting restrictions near media increases problems. We should have a reciprocal approach with China regarding media firms. Ban TikTok

It was a conference theme. I anticipated Axel Springer CEO Mathias Döpfner to say, "We're watching them." (That's CEO protocol.) TikTok should be outlawed in every democracy as an espionage tool. Rumored regulations could lead to a ban, and FCC Commissioner Brendan Carr pushes for app store prohibitions. Why not restrict Chinese propaganda? Some disagree: Several renowned tech writers argued my TikTok diatribe last week distracted us from privacy and data reform. The situation isn't zero-sum. I've warned about Facebook and other tech platforms for years. Chewing gum while walking is possible.

The Future

Is TikTok the attention-economy titans' final evolution? The attention economy acts like it. No original content. CNN+ was unplugged, Netflix is losing members and has lost 70% of its market cap, and households are canceling cable and streaming subscriptions in historic numbers. Snap Originals closed in August after YouTube Originals in January.

Everyone is outTik-ing the Tok. Netflix debuted Fast Laughs, Instagram Reels, YouTube Shorts, Snap Spotlight, Roku The Buzz, Pinterest Watch, and Twitter is developing a TikTok-like product. I think they should call it Vine. Just a thought.

Meta's internal documents show that users spend less time on Instagram Reels than TikTok. Reels engagement is dropping, possibly because a third of the videos were generated elsewhere (usually TikTok, complete with watermark). Meta has tried to downrank these videos, but they persist. Users reject product modifications. Kim Kardashian and Kylie Jenner posted a meme urging Meta to Make Instagram Instagram Again, resulting in 312,000 signatures. Mark won't hear the petition. Meta is the fastest follower in social (see Oculus and legless hellscape fever nightmares). Meta's stock is at a five-year low, giving those who opposed my demands to break it up a compelling argument.

Blue Pill

TikTok's short-term dominance in attention extraction won't be stopped by anyone who doesn't hear Hail to the Chief every time they come in. Will TikTok still be a supermajor in five years? If not, YouTube will likely rule and protect Kings Landing.

56% of Americans regularly watch YouTube. Compared to Facebook and TikTok, 95% of teens use Instagram. YouTube users upload more than 500 hours of video per minute, a number that's likely higher today. Last year, the platform garnered $29 billion in advertising income, equivalent to Netflix's total.

Business and biology both value diversity. Oil can be found in the desert, under the sea, or in the Arctic. Each area requires a specific ability. Refiners turn crude into gas, lubricants, and aspirin. YouTube's variety is unmatched. One-second videos to 12-hour movies. Others are studio-produced. (My Bill Maher appearance was edited for YouTube.)

You can dispute in the comment section or just stream videos. YouTube is used for home improvement, makeup advice, music videos, product reviews, etc. You can load endless videos on a topic or creator, subscribe to your favorites, or let the suggestion algo take over. YouTube relies on user content, but it doesn't wait passively. Strategic partners advise 12,000 creators. According to a senior director, if a YouTube star doesn’t post once week, their manager is “likely to know why.”

YouTube's kevlar is its middle, especially for creators. Like TikTok, users can start with low-production vlogs and selfie videos. As your following expands, so does the scope of your production, bringing longer videos, broadcast-quality camera teams and performers, and increasing prices. MrBeast, a YouTuber, is an example. MrBeast made gaming videos and YouTube drama comments.

Donaldson's YouTube subscriber base rose. MrBeast invests earnings to develop impressive productions. His most popular video was a $3.5 million Squid Game reenactment (the cost of an episode of Mad Men). 300 million people watched. TikTok's attention-grabbing tech is too limiting for this type of material. Now, Donaldson is focusing on offline energy with a burger restaurant and cloud kitchen enterprise.

Steps to Take

Rapid wealth growth has externalities. There is no free lunch. OK, maybe caffeine. The externalities are opaque, and the parties best suited to handle them early are incentivized to construct weapons of mass distraction to postpone and obfuscate while achieving economic security for themselves and their families. The longer an externality runs unchecked, the more damage it causes and the more it costs to fix. Vanessa Pappas, TikTok's COO, didn't shine before congressional hearings. Her comms team over-consulted her and said ByteDance had no headquarters because it's scattered. Being full of garbage simply promotes further anger against the company and the awkward bond it's built between the CCP and a rising generation of American citizens.

This shouldn't distract us from the (still existent) harm American platforms pose to our privacy, teenagers' mental health, and civic dialogue. Leaders of American media outlets don't suffer from immorality but amorality, indifference, and dissonance. Money rain blurs eyesight.

Autocratic governments that undermine America's standing and way of life are immoral. The CCP has and will continue to use all its assets to harm U.S. interests domestically and abroad. TikTok should be spun to Western investors or treated the way China treats American platforms: kicked out.

So rich,