Artificial Intelligence
Python
Web Scraping
NLP
Google Search
Automation

Sasta GPT 3

A 'Zero-Parameter' generative text engine that uses live Google Search results as its dataset to predict the next token, mimicking LLM behavior without any local weights.

1. The Challenge

  • Context: Large Language Models (LLMs) like GPT-3 require massive datasets, expensive GPU clusters, and gigabytes of VRAM to store parameters. We wanted to challenge this paradigm with a fun experiment: Could we build a "generative AI" that has zero parameters and requires no training?
  • The Obstacle: The goal was to generate coherent sentences by leveraging the "collective intelligence" of the internet in real-time. The engineering challenge was converting unstructured Google Search results into a structured "next-token prediction" stream, similar to how a Transformer works, but using web scraping instead of matrix multiplication.

2. The Solution Architecture

The application acts as a "Search-Based Markov Chain." Instead of looking up weights in a neural network, it looks up human usage patterns on the web.
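On a fixed corpus, this lookup is a textbook Markov chain. A minimal sketch (using a made-up three-sentence corpus in place of live search results) shows the frequency counting the engine performs at every step:

```python
from collections import Counter

def build_bigram_model(corpus):
    """Map each word to a Counter of the words that follow it."""
    model = {}
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        model.setdefault(current, Counter())[following] += 1
    return model

# A tiny made-up corpus standing in for the live search snippets
corpus = ("the quick brown fox jumps over the lazy dog "
          "the quick brown fox runs fast "
          "the quick brown fox jumps high")

model = build_bigram_model(corpus)
# After "fox", "jumps" appears twice and "runs" once,
# so the most common continuation is "jumps"
print(model["fox"].most_common(1)[0][0])  # jumps
```

Sasta GPT 3 does exactly this, except the "corpus" is re-fetched from Google on every iteration, so the model is always as current as the web itself.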

  1. Input: The user provides a seed phrase (e.g., "The quick brown fox").
  2. Querying: The system sends this exact phrase to Google Search.
  3. Extraction: It parses the titles and description snippets from the search results.
  4. Prediction: It identifies the word that most frequently appears immediately after the seed phrase in those snippets.
  5. Iteration: The new word is appended to the phrase, and the cycle repeats.

3. Implementation Highlights

A. The "Internet-as-a-Model" Logic

This snippet represents the core engine. It doesn't calculate probabilities from weights; it calculates them from the frequency of words appearing in search snippets.

from collections import Counter

def predict_next_token(query):
    # Search Google for the current sentence context.
    # google_search() is our scraping helper; it returns a list of
    # title/description snippet strings from the results page.
    results = google_search(query)

    candidates = []
    for snippet in results:
        # Only consider snippets that actually contain the seed phrase
        if query.lower() in snippet.lower():
            # Take the text immediately after the first occurrence of the phrase
            after_part = snippet.lower().split(query.lower(), 1)[1]
            next_word = after_part.strip().split(' ')[0]
            # Strip trailing punctuation so "fox," and "fox" count together
            next_word = next_word.strip('.,;:!?"\'')

            if next_word:
                candidates.append(next_word)

    # The "softmax": return the most frequent next word
    if candidates:
        return Counter(candidates).most_common(1)[0][0]
    return None
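The google_search helper above is left abstract. Its snippet-extraction half can be sketched with the standard library alone, though the selector below is purely illustrative: Google's real markup uses obfuscated, frequently changing class names, so in practice the pattern has to be re-discovered (or a SERP API used instead).

```python
import re
from html import unescape

def extract_snippets(html):
    """Pull human-readable snippet text out of result-page HTML.

    The <span class="snippet"> pattern is a hypothetical stand-in for
    whatever class names Google is currently serving.
    """
    pattern = r'<span class="snippet">(.*?)</span>'
    return [unescape(m).strip() for m in re.findall(pattern, html, re.S)]

# A hypothetical fragment of a results page
html = ('<span class="snippet">The quick brown fox jumps over...</span>'
        '<span class="snippet">The quick brown fox is an English pangram.</span>')

print(extract_snippets(html))
```

Each extracted snippet then flows into the frequency-counting loop above.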

B. Handling Context Drift

Since Google truncates long queries and hyper-specific strings quickly stop returning results, we can't search the entire generated paragraph every time. I implemented a "sliding window" context (analogous to an LLM's context window) to keep the search queries short and relevant.

def generate_text(seed_text, iterations=10):
    current_text = seed_text

    for _ in range(iterations):
        # Search with only the last 6 words; longer, highly specific
        # strings make Google return "No results found"
        search_context = " ".join(current_text.split()[-6:])

        next_token = predict_next_token(search_context)

        if next_token:
            current_text += " " + next_token
            print(f"Generated: {current_text}")
        else:
            break  # Stop when no continuation is found

    return current_text
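The loop can be exercised deterministically, without hitting Google, by swapping the live predictor for a canned lookup table (hypothetical data below); the sliding-window slicing is the part under test:

```python
# Canned "search results": maps a context window to its best next word.
# This stands in for the live predict_next_token() during testing.
CANNED = {
    "the quick brown": "fox",
    "the quick brown fox": "jumps",
    "the quick brown fox jumps": "over",
}

def stub_predict(context):
    return CANNED.get(context)

def generate_with(predict, seed, iterations=10, window=6):
    """Same loop as generate_text, with the predictor injected."""
    text = seed
    for _ in range(iterations):
        # Keep only the last `window` words as search context
        context = " ".join(text.split()[-window:])
        nxt = predict(context)
        if nxt is None:
            break  # no continuation found
        text += " " + nxt
    return text

print(generate_with(stub_predict, "the quick brown"))
# the quick brown fox jumps over
```

Injecting the predictor this way also makes it trivial to A/B test different search backends later.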

4. Challenges & Overcoming Roadblocks

  • The Trap: The "No Results" Dead End. As the generated sentence became unique (e.g., "The quick brown fox jumps over the lazy developer in Islamabad"), Google would return zero results because that specific sentence had never been written before.
  • The Fix: I implemented a Backoff Strategy. If a search returns no results, the engine drops the first word of the query (shortening the context) and retries. This broadens the search scope, allowing the model to find a valid grammatical continuation even if the specific semantic context is slightly lost.
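That backoff loop can be sketched as follows; search_fn stands in for the real Google query, and the tiny stub below simulates "no results" for long, unique strings:

```python
def search_with_backoff(search_fn, query, min_words=2):
    """Retry with a shorter context until the search returns results.

    Drops the leading word of the query each time search_fn comes back
    empty, stopping once only min_words remain.
    """
    words = query.split()
    while len(words) >= min_words:
        results = search_fn(" ".join(words))
        if results:
            return results  # hand these snippets to the extraction step
        words = words[1:]   # drop the first word and broaden the search
    return []

# Stub: pretend only short, common phrases have indexed results
def fake_search(q):
    return ["the lazy dog barked"] if len(q.split()) <= 2 else []

print(search_with_backoff(fake_search, "lazy developer in the lazy dog"))
```

The trade-off is deliberate: each dropped word loses a little semantic context but keeps the sentence grammatically moving.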

5. Results & Impact

  • Proof of Concept: The tool successfully generates grammatically correct phrases (e.g., completing idioms or famous quotes) purely by scraping, proving that the internet itself is a giant, uncompressed language model.
  • Cost Efficiency: "Sasta" means cheap. This approach costs $0 in training and hardware, running on a standard CPU.
  • Collaboration: This experimental project was co-authored with Muhammad Mobeen, combining web automation skills with NLP theory.