Or: How I Taught a Robot to Trade Crypto While I Sleep (and What I Learned About Market Inefficiency, Machine Learning, and GitHub Actions)
The 3 AM Wake-Up Call That Changed Everything
It was December 2023, and I was staring at my phone at 3:17 AM. Another notification. Another missed opportunity. Bitcoin had just completed what my manual analysis identified as a perfect "swing low" — that magical moment when an asset bounces off support and begins its upward journey. By the time I fumbled for my laptop, rubbed the sleep from my eyes, and logged into my exchange, the moment had passed. The price had already moved 2.3% up.
"There has to be a better way," I muttered, probably waking up my partner in the process.
That frustration became the catalyst for a six-month journey that would take me from simple Python scripts to a sophisticated, AI-powered trading system running autonomously in the cloud. This is the story of that journey, the technical challenges I faced, and the surprising lessons I learned about markets, algorithms, and the art of letting go.
Part I: The Foundation — Understanding What We're Building
Before we dive into the code and the AI magic, let's talk about what this system actually does. At its core, this is an automated trading bot designed to identify opportunities in cryptocurrency markets and execute trades with minimal human intervention. But calling it "just a trading bot" is like calling a Tesla "just a car."
The Vision: A Four-Layer Intelligence System
Most trading bots operate on a simple principle: detect signal, execute trade. Mine evolved into something more sophisticated — a four-layer decision-making system that mimics how professional traders actually think:
Layer 1: Technical Analysis (The Foundation)
↓
Layer 2: Market Sentiment (The Wisdom of Crowds)
↓
Layer 3: Multi-Strategy Ensemble (The Committee)
↓
Layer 4: Reinforcement Learning (The Adaptive Brain)
↓
Execute Trade (or Don't)

Each layer acts as a filter, ensuring that only the highest-quality opportunities make it through to actual execution. Think of it like airport security — multiple checkpoints, each looking for different red flags.
Part II: The Technical Architecture — How It All Fits Together
The GitHub Actions Advantage
Here's where things get interesting. Instead of running this on my personal computer (which would require it to be on 24/7) or paying for cloud servers, I discovered that GitHub Actions provides free compute time for running automated workflows. It's like getting a free robot assistant that never sleeps, never complains, and costs absolutely nothing.
The workflow runs every 15 minutes:
on:
  schedule:
    - cron: '*/15 * * * *'  # Every 15 minutes
  workflow_dispatch:        # Also allows manual execution

Every 15 minutes, GitHub spins up a fresh Ubuntu environment, installs my dependencies, loads the previous state (including what my AI has learned), analyzes the markets, and either executes trades or goes back to sleep. The entire process takes 2–3 minutes.
The beauty of this architecture? Zero infrastructure costs. Zero maintenance overhead. Complete automation. And if something goes wrong, I get notifications on Telegram.
Part III: Layer 1 — Technical Analysis (The Swing Detector)
The foundation of everything starts with identifying swing points — those critical moments when price momentum shifts direction. This isn't new technology; traders have been drawing these by hand on charts for decades. The challenge was teaching a computer to see what experienced traders see intuitively.
The Swing Point Algorithm
A swing low occurs when the price makes a local minimum — it's lower than both the price before it and the price after it. Similarly, a swing high is a local maximum. But not all swing points are created equal.
Here's the core detection logic:
def detect_swing_points(self, data):
    lows = data['Low'].values
    highs = data['High'].values

    # Detect swing lows
    for i in range(1, len(lows) - 1):
        if lows[i] < lows[i-1] and lows[i] < lows[i+1]:
            # Potential swing low found
            if self._validate_with_volume(i):
                self.swing_lows[i] = lows[i]

    # Detect swing highs
    for i in range(1, len(highs) - 1):
        if highs[i] > highs[i-1] and highs[i] > highs[i+1]:
            if self._validate_with_volume(i):
                self.swing_highs[i] = highs[i]

But here's where my first major lesson came in: not all swing points lead to profitable trades. In my early backtests, I was getting about a 45% win rate — barely better than flipping a coin. The issue? Noise. Markets are full of false signals, especially in volatile conditions.
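For reference, the volume check called in the detector above can be as simple as comparing the bar's volume to its recent average. A minimal sketch (the 20-bar window and 1.2x threshold are assumptions, and self.data is assumed to hold the same OHLCV frame passed to the detector):

def _validate_with_volume(self, i, lookback=20, threshold=1.2):
    # Accept the swing point only if volume at bar i is meaningfully
    # above its recent average (window and threshold are assumptions)
    volumes = self.data['Volume']
    recent_avg = volumes.iloc[max(0, i - lookback):i].mean()
    return bool(recent_avg > 0 and volumes.iloc[i] > threshold * recent_avg)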
Machine Learning Validation
This led me to add a validation layer using simple machine learning. Before accepting a swing point, I calculate these features:
- Volume ratio: Is volume significantly higher than average? (Institutional money often shows up in volume)
- Price position: Where is the price relative to its recent range? (Better to buy near lows than highs)
- Momentum: What's the rate of change? (We want reversals, not continuations)
- Distance from moving average: How extended is the price? (Mean reversion is powerful)
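Here's a sketch of how those four features might be computed from the OHLCV frame (the 20-bar window is an assumption):

def calculate_features(self, data, i, lookback=20):
    # Build the four validation features for candidate bar i
    window = data.iloc[max(0, i - lookback):i + 1]
    close = data['Close'].iloc[i]

    # Volume ratio: bar volume vs. the recent average
    volume_ratio = data['Volume'].iloc[i] / window['Volume'].mean()

    # Price position: 0 = at the recent low, 1 = at the recent high
    lo, hi = window['Low'].min(), window['High'].max()
    price_position = (close - lo) / (hi - lo) if hi > lo else 0.5

    # Momentum: rate of change across the window
    momentum = close / window['Close'].iloc[0] - 1

    # Distance from the moving average, as a fraction of it
    ma = window['Close'].mean()
    ma_distance = (close - ma) / ma

    return {
        'volume_ratio': volume_ratio,
        'price_position': price_position,
        'momentum': momentum,
        'ma_distance': ma_distance,
    }

The scoring function then combines them: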
def calculate_confidence_score(self, features):
    score = 0.0
    weights = 0.0

    # Volume confirmation (30% weight)
    if features['volume_ratio'] > 1.2:
        score += 0.3
    weights += 0.3

    # Price position (20% weight)
    if features['price_position'] < 0.3:  # Near lows
        score += 0.2
    weights += 0.2

    # ... more features

    # Each feature's weight always accrues, so the result is the
    # fraction of weighted checks that passed
    return score / weights if weights > 0 else 0.0

This simple addition boosted my win rate to 58%. Small improvements compound over hundreds of trades.
Part IV: Layer 2 — Sentiment Analysis (Reading the Room)
Technical analysis tells you what has happened. Sentiment analysis tells you what people think will happen. And in markets driven by human emotions (and increasingly, by AI trained on human behavior), this matters enormously.
The CryptoCompare Integration
I integrated with CryptoCompare's API to analyze two key sources:
- News sentiment: Scanning recent headlines for positive or negative keywords
- Social metrics: Twitter followers, Reddit activity, engagement scores
Here's the fascinating part: I initially expected sentiment to be a strong predictor of price movement. What I found surprised me. Sentiment isn't predictive — it's confirmatory. When technical analysis says "buy" but sentiment is overwhelmingly negative, it's often a false signal. The best trades happen when both align.
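Fetching the raw articles is the easy part. A minimal sketch, assuming CryptoCompare's public news endpoint (verify the exact path and field names against the current API docs):

import requests

def fetch_recent_articles(self, symbol, limit=20):
    # Pull recent headlines for one asset; endpoint path and field
    # names should be checked against CryptoCompare's documentation
    url = "https://min-api.cryptocompare.com/data/v2/news/"
    params = {'lang': 'EN', 'categories': symbol, 'api_key': self.api_key}
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    articles = response.json().get('Data', [])[:limit]
    return [{'title': a.get('title', ''), 'body': a.get('body', '')}
            for a in articles]

The scoring itself is keyword-based: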
def analyze_news_sentiment(self, articles):
    positive_keywords = ['bullish', 'surge', 'rally', 'adoption']
    negative_keywords = ['crash', 'dump', 'regulation', 'decline']
    scores = []

    for article in articles:
        text = article['title'].lower() + ' ' + article['body'].lower()
        pos_count = sum(1 for word in positive_keywords if word in text)
        neg_count = sum(1 for word in negative_keywords if word in text)
        if pos_count + neg_count > 0:
            score = (pos_count - neg_count) / (pos_count + neg_count)
            scores.append(score)

    return sum(scores) / len(scores) if scores else 0.0

A Real Example: The April 2024 Bitcoin Halving
In April 2024, my bot detected a perfect swing low on Bitcoin at $62,400. Technical analysis was screaming "buy." But sentiment analysis flagged something interesting: news sentiment was at -0.35 (moderately negative). Articles were focused on regulatory concerns and macroeconomic headwinds.
The bot correctly passed on the trade. Bitcoin continued dropping to $56,800 over the next week before finally reversing. That one avoided trade saved approximately 8% of capital. The sentiment layer had done its job.
Part V: Layer 3 — The Multi-Strategy Ensemble (Wisdom of Algorithms)
This is where things get really interesting. Instead of relying on a single strategy, I implemented four different approaches running in parallel, each with a vote on whether to trade:
┌─────────────────────────────────────────────┐
│           TRADING SIGNAL DETECTED           │
└──────────────────────┬──────────────────────┘
                       │
              ┌────────┴────────┐
              │ ENSEMBLE VOTING │
              └────────┬────────┘
                       │
     ┌───────────┬─────┴─────┬───────────┐
     │           │           │           │
┌────▼────┐ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐
│  Swing  │ │Momentum │ │  Mean   │ │  Trend  │
│ Trading │ │Strategy │ │Reversion│ │Following│
│   30%   │ │   25%   │ │   25%   │ │   20%   │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
     │           │           │           │
     └───────────┴─────┬─────┴───────────┘
                       │
               ┌───────▼───────┐
               │ WEIGHTED VOTE │
               │Consensus: 75% │
               │Confidence: 82%│
               └───────┬───────┘
                       │
               ┌───────▼───────┐
               │ TRADE or PASS │
               └───────────────┘

Strategy 1: Swing Trading (30% weight)
This is our foundation — detecting those swing points we discussed earlier.
Strategy 2: Momentum (25% weight)
Uses RSI (Relative Strength Index) and Rate of Change to identify strong directional moves:
def calculate_momentum_signal(self, data):
    # RSI calculation (14-period)
    delta = data['Close'].diff()
    gain = delta.where(delta > 0, 0).rolling(14).mean()
    loss = -delta.where(delta < 0, 0).rolling(14).mean()
    rs = gain / loss
    rsi = 100 - (100 / (1 + rs))

    # Rate of Change over the last 14 candles
    roc = ((data['Close'].iloc[-1] - data['Close'].iloc[-14]) /
           data['Close'].iloc[-14] * 100)

    # Volume relative to its recent average (window length assumed)
    volume_ratio = (data['Volume'].iloc[-1] /
                    data['Volume'].rolling(20).mean().iloc[-1])

    # Strong momentum buy
    if rsi.iloc[-1] > 55 and roc > 3 and volume_ratio > 1.2:
        return 'BUY', 0.8
    # Oversold reversal
    elif rsi.iloc[-1] < 30 and roc < -5:
        return 'BUY', 0.7
    return None, 0.0

Strategy 3: Mean Reversion (25% weight)
Markets tend to overextend and then snap back. This strategy uses Bollinger Bands to identify extreme deviations:
def mean_reversion_signal(self, data):
    sma = data['Close'].rolling(20).mean()
    std = data['Close'].rolling(20).std()
    upper_band = sma + (2 * std)
    lower_band = sma - (2 * std)
    current_price = data['Close'].iloc[-1]

    # Near lower band = oversold = buy
    if (current_price - lower_band.iloc[-1]) / lower_band.iloc[-1] < 0.02:
        return 'BUY', 0.85
    # Near upper band = overbought = sell
    elif (upper_band.iloc[-1] - current_price) / upper_band.iloc[-1] < 0.02:
        return 'SELL', 0.85
    return None, 0.0

Strategy 4: Trend Following (20% weight)
Detects moving average crossovers — the bread and butter of trend traders:
def trend_following_signal(self, data):
    ma_fast = data['Close'].rolling(10).mean()
    ma_slow = data['Close'].rolling(50).mean()

    # Golden cross: fast crosses above slow
    if ma_fast.iloc[-2] <= ma_slow.iloc[-2] and ma_fast.iloc[-1] > ma_slow.iloc[-1]:
        return 'BUY', 0.9
    # Death cross: fast crosses below slow
    elif ma_fast.iloc[-2] >= ma_slow.iloc[-2] and ma_fast.iloc[-1] < ma_slow.iloc[-1]:
        return 'SELL', 0.9
    return None, 0.0

The Voting Process
Each strategy casts a vote with a confidence score. The ensemble aggregates these votes using weighted confidence:
def aggregate_votes(self, votes, weights):
    buy_score = 0.0
    sell_score = 0.0

    for strategy, vote in votes.items():
        if vote['signal'] == 'BUY':
            buy_score += vote['confidence'] * weights[strategy]
        elif vote['signal'] == 'SELL':
            sell_score += vote['confidence'] * weights[strategy]

    # Need at least 60% consensus
    if buy_score > sell_score and buy_score > 0.6:
        return 'BUY', buy_score
    elif sell_score > buy_score and sell_score > 0.6:
        return 'SELL', sell_score
    return None, 0.0

The magic happens when strategies disagree. If momentum says "buy" but mean reversion says "sell," that's often a sign the market is transitioning between states. The ensemble correctly identifies these ambiguous moments and passes on the trade.
Part VI: Layer 4 — Reinforcement Learning (The Adaptive Brain)
This is where we enter truly cutting-edge territory. Everything up to this point has been rules-based: IF this THEN that. But markets are dynamic, ever-changing beasts. What works in trending markets fails in ranging markets. What works during low volatility fails during high volatility.
Enter reinforcement learning — a machine learning approach where an agent learns optimal behavior through trial and error, receiving rewards for good decisions and penalties for bad ones.
The RL Position Sizing System
Traditional bots use fixed position sizing: allocate X% of capital per trade. My RL system learns to adjust position size based on market conditions:
from dataclasses import dataclass

@dataclass
class MarketState:
    volatility: float         # Current market volatility
    trend_strength: float     # How strong is the trend?
    win_rate_recent: float    # Recent trading performance
    drawdown_current: float   # How much are we down from peak?
    positions_open: int       # How many trades are active?
    confidence_signal: float  # How confident is the ensemble?

The RL agent discretizes this continuous state space into buckets and maintains a Q-table — essentially a lookup table that says "when in state X, action Y has historically produced this much reward."
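The bucketing can be as simple as a few fixed boundaries per feature. A minimal sketch of _discretize_state (every boundary below is an illustrative assumption):

def _discretize_state(self, state):
    # Map the continuous MarketState onto a coarse, hashable key
    def bucket(value, bounds):
        for i, b in enumerate(bounds):
            if value < b:
                return i
        return len(bounds)

    return (
        bucket(state.volatility, [0.01, 0.03]),        # calm / normal / volatile
        bucket(state.trend_strength, [0.3, 0.6]),      # weak / moderate / strong
        bucket(state.win_rate_recent, [0.45, 0.60]),   # poor / okay / good
        bucket(state.drawdown_current, [0.05, 0.15]),  # shallow / moderate / deep
        min(state.positions_open, 3),                  # cap the position count
        bucket(state.confidence_signal, [0.60, 0.75])  # threshold levels
    )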
def select_action(self, state):
    state_key = self._discretize_state(state)

    # Epsilon-greedy: 90% exploit, 10% explore
    if random.random() < 0.1:
        # Exploration: try something random
        return random.choice(self.actions)
    else:
        # Exploitation: use best known action
        q_values = self.q_table[state_key]
        return max(q_values, key=q_values.get)

The Action Space
The RL agent can choose from 15 different actions (combinations of position size and leverage):
Position Sizes: 20%, 33%, 50%, 70%, 100% of available capital
Leverage Multipliers: 0.5x, 1.0x, 1.5x of base leverage
Examples:
- Action 1: 20% capital, 0.5x leverage (very conservative)
- Action 8: 50% capital, 1.0x leverage (balanced)
- Action 15: 100% capital, 1.5x leverage (aggressive)
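One way to enumerate these 15 actions, consistent with the numbering in the examples above, is a simple cross-product (a sketch; the bot's actual encoding may differ):

from itertools import product

POSITION_SIZES = [0.20, 0.33, 0.50, 0.70, 1.00]  # fraction of available capital
LEVERAGE_MULTS = [0.5, 1.0, 1.5]                 # multiplier on base leverage

# Actions 1-15: Action 1 = (20%, 0.5x), Action 8 = (50%, 1.0x),
# Action 15 = (100%, 1.5x), matching the examples above
ACTIONS = {
    i + 1: {'size': size, 'leverage': mult}
    for i, (size, mult) in enumerate(product(POSITION_SIZES, LEVERAGE_MULTS))
}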
The Learning Process

After every trade closes, the RL agent calculates a reward and updates its knowledge:
def calculate_reward(self, trade_result):
    pnl_pct = trade_result['pnl_pct']

    # Base reward from profit/loss
    reward = pnl_pct / 10  # Normalize

    # Bonus for big wins
    if pnl_pct > 5.0:
        reward *= 1.5
    # Extra penalty for big losses
    elif pnl_pct < -3.0:
        reward *= 1.5  # Amplify the pain (reward is negative here)

    # Bonus for proper risk management
    if trade_result['used_stop_loss']:
        reward *= 1.1

    return reward

def update_q_value(self, state, action, reward, next_state):
    current_q = self.q_table[state][action]
    max_next_q = max(self.q_table[next_state].values())

    # Q-learning update rule (learning rate 0.1, discount 0.95)
    new_q = current_q + 0.1 * (reward + 0.95 * max_next_q - current_q)
    self.q_table[state][action] = new_q

Real Learning Example
In the first month, my RL agent was conservative — mostly choosing 33% allocation with 1.0x leverage. Safe, but limiting returns. Over time, it learned something fascinating: during high-confidence signals (>75%) in low-volatility conditions, it could safely increase to 70% allocation with 1.5x leverage. This insight emerged purely from the data — I never explicitly programmed it.
Conversely, after a string of losses in volatile conditions, the agent learned to dramatically reduce exposure until conditions improved. This adaptive behavior — impossible with fixed rules — is what makes RL powerful.
Part VII: Putting It All Together — The Decision Flow
Let's walk through a real example from November 2024 when the bot executed one of its best trades:
November 12, 2024, 14:45 UTC — Bitcoin at $36,840
🔍 Layer 1: Technical Analysis
→ Swing low detected at $36,820
→ ML confidence: 82%
→ Volume: 143% of average
✅ PASS
📊 Layer 2: Sentiment Analysis
→ News sentiment: +0.42 (bullish)
→ Social engagement: Rising
→ Overall: Positive confirmation
✅ PASS
🎯 Layer 3: Ensemble Voting
→ Swing Strategy: BUY (85%)
→ Momentum Strategy: BUY (72%)
→ Mean Reversion: NEUTRAL (55%)
→ Trend Following: BUY (68%)
→ Consensus: 75% (3/4 agree)
→ Weighted confidence: 76%
✅ PASS (exceeds 60% threshold)
🤖 Layer 4: RL Position Sizing
→ Market state: Low vol, strong trend, good recent WR
→ Selected action: 70% allocation, 1.3x leverage
→ Capital allocated: $560 @ 4x leverage
✅ PASS
💰 EXECUTION
→ Entry: $36,840
→ Position size: 0.0608 BTC
→ Stop loss: $35,365 (-4%)
→ Take profit: $39,787 (+8%)

The trade hit take profit 38 hours later at $39,825, realizing a gain of 8.1% on the position (32.4% accounting for 4x leverage). The RL agent recorded this success and updated its Q-values, reinforcing the lesson that aggressive sizing works well during high-confidence, low-volatility setups.
Part VIII: Risk Management — The Unsexy Hero
Here's an uncomfortable truth: the difference between a profitable trading system and a bankrupt one isn't usually the entry signals — it's the risk management. I learned this the hard way.
The April 2024 Disaster
In April 2024, I was running an earlier version of the bot with fixed 5x leverage and no dynamic position sizing. The system caught a beautiful setup on Ethereum and entered with 50% of capital. Technical analysis was perfect. Sentiment was aligned. Everything looked great.
Then Ethereum dropped 12% in 72 hours due to unexpected regulatory news. My 5x leverage turned that 12% drop into a 60% loss on the position. Worse, because I was using 50% of capital, this single trade nuked 30% of my total account.
The technical analysis was right — Ethereum eventually went up 40% from that entry point. But I was stopped out with a devastating loss because I didn't account for volatility and black swan risk.
This experience led me to implement three critical safeguards:
1. Regime-Aware Risk Parameters
The market isn't static. During calm periods, you can afford tighter stops. During volatile periods, you need wider stops or smaller positions. I implemented a regime detector that classifies market conditions:
def detect_market_regime(self, data, lookback=30):
    recent = data.tail(lookback)
    returns = recent['Close'].pct_change()
    volatility = returns.std()
    avg_volatility = data['Close'].pct_change().std()

    # Calculate trend strength
    close_change = abs(recent['Close'].iloc[-1] - recent['Close'].iloc[0])
    high_low_range = (recent['High'] - recent['Low']).mean()
    trend_strength = close_change / (high_low_range * lookback)

    if volatility > avg_volatility * 1.5:
        return 'VOLATILE'
    elif trend_strength > 0.5:
        return 'TRENDING'
    else:
        return 'RANGING'

Then adapt risk parameters to the regime:
def get_adapted_parameters(self, regime, base_sl, base_tp):
    if regime == 'VOLATILE':
        return {
            'stop_loss': base_sl * 1.5,    # Wider stops
            'take_profit': base_tp * 1.0,  # Keep targets the same
            'position_size': 0.5           # Reduce size
        }
    elif regime == 'TRENDING':
        return {
            'stop_loss': base_sl * 1.2,
            'take_profit': base_tp * 1.5,  # Bigger targets
            'position_size': 1.0
        }
    else:  # RANGING
        return {
            'stop_loss': base_sl * 0.8,    # Tighter stops
            'take_profit': base_tp * 0.7,  # Smaller targets
            'position_size': 0.8
        }

2. Trailing Stops
This was a game-changer. Instead of having a fixed take-profit target, once a trade moves into profit by a certain threshold (3% in my case), I activate a trailing stop:
def update_trailing_stop(self, position, current_price):
    if position['pnl_pct'] > 3.0:  # In profit by 3%+
        # Track peak price
        if current_price > position['peak_price']:
            position['peak_price'] = current_price

        # Calculate drawdown from peak
        peak_pnl = calculate_pnl(position['entry'], position['peak_price'])
        current_pnl = calculate_pnl(position['entry'], current_price)
        drawdown = peak_pnl - current_pnl

        # Close if drawdown exceeds 2.5%
        if drawdown > 2.5:
            self.close_position(position, 'TRAILING_STOP')

This simple addition increased my average winning trade by 40% by letting winners run while protecting profits.
3. Correlation-Aware Position Limits
The bot trades multiple assets (BTC, ETH, SOL, ADA). But here's a trap: if all your positions are highly correlated, you don't actually have diversification — you have one big concentrated bet.
I implemented a correlation matrix that prevents opening new positions if they're too correlated with existing ones:
def check_correlation_limit(self, new_asset, open_positions, max_corr=0.7):
    returns_dict = {}
    for asset in [new_asset] + open_positions:
        data = self.get_market_data(asset)
        returns_dict[asset] = data['Close'].pct_change().tail(30)

    corr_matrix = pd.DataFrame(returns_dict).corr()

    # Check correlation with each open position
    for open_asset in open_positions:
        correlation = abs(corr_matrix.loc[new_asset, open_asset])
        if correlation > max_corr:
            return False, correlation

    return True, 0.0

During the May 2024 crypto market correction, this saved me from having three simultaneous losing positions. The bot correctly identified that all three signals (BTC, ETH, SOL) were moving in lockstep with 0.89 correlation and only opened one position instead of three.
Part IX: The GitHub Actions Magic — Zero-Infrastructure Automation
Let's talk about the secret sauce that makes this entire system practical: GitHub Actions. Most retail traders either run bots on their personal computers (which have to stay on 24/7) or rent expensive cloud servers. I'm doing neither.
The Workflow Architecture
Every 15 minutes, GitHub automatically:
- Spins up a fresh Ubuntu VM
- Installs Python and all dependencies
- Loads the RL agent's previous state from cache
- Executes the bot
- Saves the updated RL state
- Tears down the VM
Total cost: $0.00
Here's the workflow configuration:
name: Kraken Trading Bot V4

on:
  schedule:
    - cron: '*/15 * * * *'  # Every 15 minutes
  workflow_dispatch:        # Manual trigger option

jobs:
  trade:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: |
          pip install -r requirements.txt

      - name: Load RL State
        uses: actions/cache@v3
        with:
          path: rl_state.json
          key: rl-state-${{ github.run_number }}
          restore-keys: |
            rl-state-

      - name: Execute Bot
        env:
          KRAKEN_API_KEY: ${{ secrets.KRAKEN_API_KEY }}
          KRAKEN_API_SECRET: ${{ secrets.KRAKEN_API_SECRET }}
          CRYPTOCOMPARE_API_KEY: ${{ secrets.CRYPTOCOMPARE_API_KEY }}
        run: |
          python kraken_bot_v4_advanced.py

      - name: Save RL State
        uses: actions/cache@v3
        with:
          path: rl_state.json
          key: rl-state-${{ github.run_number }}

The Secrets Management
All sensitive data (API keys, credentials) are stored as GitHub Secrets — encrypted environment variables that are only available during workflow execution:
Settings → Secrets → Actions → New Secret
KRAKEN_API_KEY: your_kraken_api_key
KRAKEN_API_SECRET: your_kraken_secret
CRYPTOCOMPARE_API_KEY: your_cryptocompare_key
TELEGRAM_BOT_TOKEN: your_telegram_token
TELEGRAM_CHAT_ID: your_chat_id

The bot never logs these values, they're never committed to the repository, and they're automatically masked in the workflow logs.
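Inside the bot, these arrive as ordinary environment variables via the workflow's env: block:

import os

KRAKEN_API_KEY = os.environ['KRAKEN_API_KEY']        # fails fast if missing
KRAKEN_API_SECRET = os.environ['KRAKEN_API_SECRET']
CRYPTOCOMPARE_API_KEY = os.environ.get('CRYPTOCOMPARE_API_KEY', '')  # optional layer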
The RL State Persistence Challenge
Here's a tricky problem: the RL agent learns over time by updating its Q-table. But each GitHub Actions run is completely fresh — no memory of previous runs. How do we maintain learning across executions?
The solution is the actions/cache system. After each run, I serialize the Q-table to JSON and cache it:
def save_state(self):
    state_data = {
        'q_table': {k: dict(v) for k, v in self.q_table.items()},
        'metadata': {
            'last_update': datetime.now().isoformat(),
            'num_states': len(self.q_table),
            'total_trades': self.trade_count
        }
    }
    with open('rl_state.json', 'w') as f:
        json.dump(state_data, f)

GitHub's cache system then preserves this file between runs. The next execution loads it:
def load_state(self):
    try:
        with open('rl_state.json', 'r') as f:
            state_data = json.load(f)
        self.q_table = {
            k: {int(action): value for action, value in actions.items()}
            for k, actions in state_data['q_table'].items()
        }
        print(f"Loaded RL state: {len(self.q_table)} states")
    except FileNotFoundError:
        print("No previous RL state found, starting fresh")

This elegant solution means the AI genuinely learns and improves over time, even though it's running on ephemeral infrastructure.
Monitoring and Notifications
Every significant event triggers a Telegram notification:
def send_notification(self, message):
    url = f"https://api.telegram.org/bot{self.token}/sendMessage"
    data = {
        'chat_id': self.chat_id,
        'text': message,
        'parse_mode': 'HTML'
    }
    response = requests.post(url, data=data)
    return response.status_code == 200

My phone buzzes when:
- A new position is opened
- A position is closed (with P&L details)
- An error occurs
- A significant market event is detected
This gives me real-time visibility without needing to constantly check the system.
Part X: Real Performance and Lessons Learned
Let's talk numbers. After six months of live trading (May 2024 — November 2024), here's what the system has achieved:
Performance Metrics (as of Nov 2024):
- Total trades executed: 147
- Win rate: 61.2%
- Average win: 6.8%
- Average loss: 3.2%
- Profit factor: 2.41
- Maximum drawdown: 18.3%
- Total return: 94.7%
But raw returns tell only part of the story. Here are the lessons that matter:
Lesson 1: The Algorithm Is Only as Good as Its Risk Management
My worst month was June 2024, down 11.4%. What happened? A series of small, disciplined losses followed by two big wins — except I closed the big wins too early because I didn't trust the system. The month I learned to let the algorithm do its job without intervention was the month performance stabilized.
Lesson 2: Market Regimes Matter More Than Strategy
During the August 2024 ranging market, my win rate dropped to 52%. The system was designed more for trending markets. The regime detector helped, but it taught me that no single approach works in all market conditions. The ensemble system's real value is adaptability — different strategies shine in different environments.
Lesson 3: The RL Agent's Learning Curve Is Painfully Slow
It took the RL agent approximately 80 trades to develop consistently useful position sizing strategies. The first 50 trades were essentially random exploration. This is why I'm glad I started with small position sizes during the learning phase.
Lesson 4: False Signals Are Expensive, Missed Opportunities Are Free
One of my most important realizations: the cost of a false signal (entering a bad trade) is real money lost. The cost of a missed opportunity (not entering a good trade) is exactly zero. This mental shift made me much more comfortable with the system's conservative filtering through multiple layers. Better to miss 10 good trades than to take 3 bad ones.
Lesson 5: Automation Removes Emotional Decisions
The most valuable aspect isn't the AI or the fancy algorithms — it's the elimination of emotional decision-making. I can't override the system at 3 AM anymore. I can't "just this once" increase position size because I have a feeling. The consistency this provides is worth more than any individual strategy improvement.
Part XI: The Code Architecture — A Tour Through the System
For those interested in the technical implementation, let's walk through the key components:
The Main Bot Class
class TradingBotV4:
    def __init__(self, config):
        self.config = config
        self.kraken = KrakenClient(config.API_KEY, config.API_SECRET)
        self.telegram = TelegramNotifier(config.TELEGRAM_TOKEN)

        # Initialize AI components
        if config.USE_SENTIMENT:
            self.sentiment = SentimentAnalyzer(config.CRYPTOCOMPARE_KEY)
        if config.USE_ENSEMBLE:
            self.ensemble = EnsembleSystem(weights={
                'swing': 0.30,
                'momentum': 0.25,
                'mean_reversion': 0.25,
                'trend_following': 0.20
            })
        if config.USE_RL:
            self.rl_sizer = RLPositionSizer(
                learning_rate=0.1,
                epsilon=0.1,
                state_file='rl_state.json'
            )

    def run(self):
        # Main execution loop
        try:
            # Get market data
            data = self.fetch_market_data()

            # Get open positions
            positions = self.kraken.get_open_positions()

            # Manage existing positions
            self.manage_positions(positions, data)

            # Look for new opportunities
            if len(positions) < self.config.MAX_POSITIONS:
                signals = self.find_trading_signals(data)
                for signal in signals:
                    if self.validate_signal(signal):
                        self.execute_trade(signal)

            # Save RL state
            if self.rl_sizer:
                self.rl_sizer.save_state()

        except Exception as e:
            self.telegram.send_error(f"Bot error: {str(e)}")
            raise

The Multi-Layer Validation
def validate_signal(self, signal):
    """Pass signal through all validation layers"""
    # Layer 1: Technical (already validated)

    # Layer 2: Sentiment
    if self.sentiment:
        sentiment_score = self.sentiment.get_sentiment(signal.asset)
        if not self._sentiment_confirms(sentiment_score, signal.direction):
            return False

    # Layer 3: Ensemble
    if self.ensemble:
        ensemble_decision = self.ensemble.get_decision(
            signal.data,
            signal
        )
        if ensemble_decision.consensus < 0.6:
            return False
        if ensemble_decision.confidence < 0.6:
            return False
        signal.ensemble_confidence = ensemble_decision.confidence

    # Layer 4: RL Position Sizing (doesn't reject, just adjusts size)
    if self.rl_sizer:
        market_state = self._calculate_market_state(signal.data)
        signal.position_size, signal.leverage = self.rl_sizer.get_position_size(
            market_state,
            self.get_available_capital(),
            self.config.BASE_LEVERAGE
        )

    return True

The Position Management Loop
def manage_positions(self, positions, current_data):
    """Check all open positions for exit conditions"""
    for position_id, position in positions.items():
        asset = position['asset']
        current_price = current_data[asset]['Close'].iloc[-1]

        # Get regime-adapted parameters
        regime = self.detect_regime(current_data[asset])
        params = self.get_regime_params(regime)

        # Check exit conditions
        should_exit, reason = self.check_exit_conditions(
            position,
            current_price,
            params
        )

        if should_exit:
            self.close_position(position, reason)

            # Update RL if active
            if self.rl_sizer:
                reward = self.calculate_reward(position)
                self.rl_sizer.update(
                    position.state,
                    position.action,
                    reward
                )

The architecture is modular — each component can be enabled or disabled independently through configuration. This makes testing and iteration much easier.
Part XII: The Future — Where This Is Heading
As I write this in late 2024, I'm working on several enhancements:
1. Deep Reinforcement Learning
The current Q-learning approach works but is limited by state discretization. I'm experimenting with a Deep Q-Network (DQN) that uses neural networks to approximate Q-values, allowing for continuous state spaces and more nuanced decision-making.
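As a rough sketch of where this is heading, here's a minimal Q-network, assuming PyTorch (not part of the current stack); the six state features and 15 actions match the Q-table setup described earlier, while the layer sizes are arbitrary:

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    # Maps the six market-state features to one Q-value per action
    def __init__(self, state_dim=6, n_actions=15, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)

# Greedy action for a single state vector:
# q_values = QNetwork()(torch.tensor(state_vector, dtype=torch.float32))
# action = int(q_values.argmax())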
2. Multi-Timeframe Analysis
Currently, the bot only analyzes 1-hour candles. Professional traders look at multiple timeframes — the daily for overall trend, the 4-hour for position timing, the 1-hour for entry precision. Implementing this is my next major project.
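The data plumbing for this is straightforward with pandas. A sketch that rolls 1-hour candles up into higher timeframes (assuming a DatetimeIndex):

def resample_ohlcv(hourly, rule):
    # Aggregate 1-hour candles up to a higher timeframe ('4h', '1D', ...)
    return hourly.resample(rule).agg({
        'Open': 'first',
        'High': 'max',
        'Low': 'min',
        'Close': 'last',
        'Volume': 'sum',
    }).dropna()

# daily = resample_ohlcv(hourly, '1D')     # overall trend
# four_hour = resample_ohlcv(hourly, '4h') # position timing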
3. Portfolio Optimization
Right now, position sizing is per-asset. But there's a more sophisticated approach: treating the entire portfolio as a single optimization problem. Given N potential signals, what combination of position sizes maximizes expected return for a given level of risk? This is Markowitz portfolio theory applied to algorithmic trading.
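In its simplest unconstrained form, the mean-variance answer weights assets in proportion to the inverse covariance matrix times expected returns. A sketch (a real version needs no-shorting and leverage constraints, and therefore a proper optimizer):

import numpy as np

def mean_variance_weights(returns):
    # Unconstrained tangency weights, proportional to cov^-1 @ mu,
    # normalized to sum to one
    mu = returns.mean().values
    cov = returns.cov().values
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()

# weights = mean_variance_weights(price_df.pct_change().dropna())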
4. Sentiment Deep Dive with LLMs
The current sentiment analysis is keyword-based — crude but effective. I'm exploring using large language models (GPT-4, Claude) to perform more nuanced sentiment analysis, understanding context and sarcasm.
Conclusion: What I Wish I Knew When I Started
If I could go back and give advice to myself in December 2023, here's what I'd say:
Start simpler than you think necessary. My first version tried to do too much. Start with basic swing detection and proper risk management. Add complexity only when you hit clear limitations.
Backtest obsessively, but trust forward testing more. Backtests will always look better than live trading. The market you're backtesting on is dead; it's not coming back. Forward testing (paper trading) is the only real validation.
The best system is the one you'll actually follow. I could probably build something more sophisticated with options strategies and exotic derivatives. But I wouldn't trust myself to not tinker with it at 2 AM. Simple, automated, hands-off beats complex and manually managed.
Small edges compound magnificently. A 61% win rate doesn't sound impressive. But over 147 trades, it's the difference between making money and losing money. And it compounds — those winners fund larger positions, which generate bigger wins.
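To put a number on it: using the averages reported above, the expectancy per trade is roughly 0.612 × 6.8% − 0.388 × 3.2% ≈ +2.9%. That small positive number, repeated across 147 trades and compounded, is what produces the headline return.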
The automation is the innovation. The individual strategies aren't revolutionary — traders have been using moving averages and RSI for decades. The innovation is having an emotionless system that executes perfectly, every time, without fail, while you sleep. That's the edge.
The Technical Stack Summary
For those wanting to build something similar, here's what you need:
Core Technologies:
- Python 3.11+
- pandas, numpy (data manipulation)
- yfinance (market data)
- requests (API calls)
APIs Used:
- Kraken (trading execution)
- CryptoCompare (sentiment & on-chain data)
- Telegram (notifications)
Infrastructure:
- GitHub Actions (compute & automation)
- GitHub Secrets (credential management)
- GitHub Cache (state persistence)
Machine Learning:
- scikit-learn (feature engineering)
- Custom Q-learning implementation (RL)
Total Monthly Cost: $0.00
Final Thoughts
Building this system taught me as much about myself as it did about markets. I learned that I'm not as rational as I thought — my emotions were sabotaging my trading far more than I realized. I learned that "set and forget" isn't lazy; it's discipline. I learned that markets are efficient enough that you need an edge, but inefficient enough that edges exist.
Most importantly, I learned that the best trading system is the one that lets you sleep soundly at night. Because even if the bot makes a mistake, at least it's a consistent, logical mistake — not an emotional one made at 3 AM because I was worried about missing out.
The code is imperfect. The strategies could be better. The RL agent is still learning. But it's running, it's profitable, and it's teaching me something new every day.
And isn't that the point?
Want to explore the code? The full system is open source and available on GitHub. Fair warning: it's complex, it requires careful setup, and it can lose money if misconfigured. But if you're the type of person who read this entire article, you're probably the type who can handle it.
A final disclaimer: This is not financial advice. This is one programmer's journey into algorithmic trading. Markets are risky. Leverage is dangerous. Automation can fail catastrophically. Start small, test extensively, and never risk money you can't afford to lose.
Now if you'll excuse me, I have a Telegram notification. The bot just closed a position with a 7.2% gain. Time to let it find the next one.
— Written from a café in Spain, while my bot works tirelessly in the cloud
If you found this article valuable, consider following for more deep dives into algorithmic trading, machine learning, and the intersection of code and markets. Feel free to reach out with questions — I love discussing this stuff.
All performance figures are from my personal trading account and are provided for educational purposes only. Past performance does not guarantee future results.