
Homester Backend Architecture

Comprehensive technical documentation for the Homester AI-powered real estate platform backend.

Table of Contents

  1. System Overview
  2. Technology Stack
  3. Database Architecture
  4. API Layer Design
  5. AI Integration
  6. Authentication & Security
  7. Data Flow
  8. Deployment Architecture
  9. Performance & Scalability
  10. Development Guidelines

System Overview

The Homester backend is a production-ready Flask application that combines traditional REST API patterns with modern AI capabilities to provide intelligent real estate search and management functionality.

Core Principles

High-Level Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   Load Balancer │    │   CDN/Assets    │
│   (React/Vue)   │────│   (nginx/ALB)   │────│   (CloudFront)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │
         │             ┌─────────────────┐
         │             │   Flask API     │
         └─────────────│   (Gunicorn)    │
                       └─────────────────┘
                                │
                ┌───────────────┼────────────────┐
                │               │                │
       ┌─────────────────┐ ┌──────────┐ ┌─────────────────┐
       │   PostgreSQL    │ │  Redis   │ │   OpenAI API    │
       │   (Primary DB)  │ │ (Cache)  │ │   (AI Services) │
       └─────────────────┘ └──────────┘ └─────────────────┘

Technology Stack

Backend Framework

Framework: Flask 2.3+
API Documentation: Flask-RESTX (OpenAPI/Swagger)
Web Server: Gunicorn with sync workers
ASGI Server: Compatible with Uvicorn for async operations

Database Layer

Primary Database: PostgreSQL 14+
ORM: SQLAlchemy 2.0+
Connection Pooling: SQLAlchemy pool (size: 10, overflow: 20)
Migrations: Flask-Migrate (Alembic)
Cache Layer: Redis 6+ (optional)

AI & Machine Learning

Language Model: OpenAI GPT-4o
Embeddings: text-embedding-3-small
Vector Storage: PostgreSQL with JSON columns
Similarity Search: Cosine similarity calculation

Security & Authentication

Authentication: JWT with Flask-JWT-Extended
Password Hashing: Werkzeug PBKDF2
Rate Limiting: Flask-Limiter
CORS: Flask-CORS
Input Validation: Custom validators

DevOps & Deployment

Containerization: Docker
Orchestration: Kubernetes
Process Management: Gunicorn
Monitoring: Built-in logging + external tools
CI/CD: Docker-based deployment

Database Architecture

Schema Design

The database follows a normalized relational model with JSON fields for flexible data storage:

```sql
-- Core user management
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(120) UNIQUE NOT NULL,
    password_hash VARCHAR(256) NOT NULL,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    phone VARCHAR(20),
    is_active BOOLEAN DEFAULT TRUE,
    is_verified BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Property listings with comprehensive data
CREATE TABLE properties (
    id SERIAL PRIMARY KEY,
    mls_id VARCHAR(50) UNIQUE,
    title VARCHAR(200) NOT NULL,
    description TEXT,
    address VARCHAR(500) NOT NULL,
    city VARCHAR(100) NOT NULL,
    state VARCHAR(50) NOT NULL,
    zip_code VARCHAR(10) NOT NULL,
    country VARCHAR(50) DEFAULT 'USA',

    -- Property details
    price DECIMAL(12,2) NOT NULL,
    bedrooms INTEGER,
    bathrooms DECIMAL(3,1),
    square_feet INTEGER,
    lot_size DECIMAL(8,3),
    year_built INTEGER,
    property_type VARCHAR(50),
    listing_type VARCHAR(20) DEFAULT 'for_sale',

    -- Location data
    latitude DECIMAL(10,8),
    longitude DECIMAL(11,8),

    -- Flexible JSON data
    features JSONB,           -- amenities, parking, etc.
    schools JSONB,            -- nearby schools with ratings
    neighborhood_info JSONB,  -- walkability, transit, etc.
    image_urls JSONB,         -- array of image URLs

    -- Listing information
    virtual_tour_url VARCHAR(500),
    listing_agent_name VARCHAR(100),
    listing_agent_phone VARCHAR(20),
    listing_agent_email VARCHAR(120),
    listing_date TIMESTAMP WITH TIME ZONE,
    status VARCHAR(20) DEFAULT 'active',

    -- AI search optimization
    embedding_vector TEXT,    -- JSON string of vector embeddings

    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Chat system for AI interactions
CREATE TABLE chat_sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(100) UNIQUE NOT NULL,
    user_id INTEGER REFERENCES users(id),
    title VARCHAR(200),
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE TABLE chat_messages (
    id SERIAL PRIMARY KEY,
    session_id INTEGER REFERENCES chat_sessions(id) ON DELETE CASCADE,
    role VARCHAR(20) NOT NULL,  -- 'user', 'assistant', 'system'
    content TEXT NOT NULL,
    message_metadata JSONB,     -- additional context, property IDs, etc.
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- User favorites for property bookmarking
CREATE TABLE user_favorites (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
    property_id INTEGER REFERENCES properties(id) ON DELETE CASCADE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(user_id, property_id)
);
```

Indexing Strategy

```sql
-- Performance optimization indexes
CREATE INDEX idx_properties_price ON properties(price);
CREATE INDEX idx_properties_bedrooms ON properties(bedrooms);
CREATE INDEX idx_properties_bathrooms ON properties(bathrooms);
CREATE INDEX idx_properties_location ON properties(city, state);
CREATE INDEX idx_properties_type ON properties(property_type);
CREATE INDEX idx_properties_status ON properties(status);
CREATE INDEX idx_properties_mls ON properties(mls_id);

-- User and session indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_chat_sessions_user ON chat_sessions(user_id);
CREATE INDEX idx_chat_messages_session ON chat_messages(session_id);
CREATE INDEX idx_user_favorites_user ON user_favorites(user_id);
```

Data Relationships

Users (1) ──────────── (∞) Chat Sessions
  │                          │
  │                          └── (∞) Chat Messages
  │
  └── (∞) User Favorites ── (1) Properties

API Layer Design

Flask-RESTX Architecture

The API follows a namespace-based organization pattern using Flask-RESTX for automatic OpenAPI documentation generation:

```python
from flask import Flask
from flask_cors import CORS
from flask_restx import Api

# Application factory pattern.
# db, jwt, limiter and the *_ns namespaces are defined in the project's
# own extension and API modules.
def create_app():
    app = Flask(__name__)

    # Initialize extensions
    db.init_app(app)
    jwt.init_app(app)
    CORS(app)
    limiter.init_app(app)

    # Initialize API with documentation
    api = Api(app, doc='/docs/', version='1.0')

    # Register namespaces
    api.add_namespace(auth_ns, path='/api/auth')
    api.add_namespace(properties_ns, path='/api/properties')
    api.add_namespace(chat_ns, path='/api/chat')

    return app
```

API Namespace Organization

Authentication Namespace (/api/auth)

```python
@auth_ns.route('/register')
class RegisterResource(Resource):
    @auth_ns.expect(register_model)
    @auth_ns.marshal_with(user_response_model)
    def post(self):
        """Register a new user"""
        # Implementation with validation and JWT token generation


@auth_ns.route('/login')
class LoginResource(Resource):
    @auth_ns.expect(login_model)
    @auth_ns.marshal_with(token_response_model)
    def post(self):
        """Authenticate user and return tokens"""
        # Implementation with password verification and token issuance
```

Properties Namespace (/api/properties)

```python
@properties_ns.route('/search')
class PropertySearchResource(Resource):
    @properties_ns.marshal_with(search_response_model)
    def get(self):
        """Search properties with filters"""
        # Advanced filtering with PostgreSQL queries
        # Integration with vector similarity search


# The route needs a URL parameter to supply property_id to get()
@properties_ns.route('/<int:property_id>')
class PropertyResource(Resource):
    @properties_ns.marshal_with(property_model)
    def get(self, property_id):
        """Get property by ID"""
        # Detailed property information with related data
```

Chat Namespace (/api/chat)

@chat_ns.route('/chat')
class ChatResource(Resource):
    @chat_ns.expect(chat_message_model)
    @chat_ns.marshal_with(chat_response_model)
    @limiter.limit("30 per minute")
    def post(self):
        """Send a message and get AI response"""
        # OpenAI integration with property search
        # Vector similarity matching

Request/Response Models

```python
from flask_restx import fields

# Automatic validation and documentation
user_model = api.model('User', {
    'id': fields.Integer(required=True),
    'email': fields.String(required=True),
    'first_name': fields.String(),
    'last_name': fields.String(),
    'created_at': fields.DateTime()
})

property_model = api.model('Property', {
    'id': fields.Integer(required=True),
    'title': fields.String(required=True),
    'price': fields.Float(required=True),
    'bedrooms': fields.Integer(),
    'bathrooms': fields.Float(),
    'features': fields.Raw(),  # JSON field
    'image_urls': fields.List(fields.String())
})
```

AI Integration

OpenAI Service Architecture

The AI integration follows a service-oriented architecture with clear separation of concerns:

```python
import os
from typing import Dict, List

from openai import OpenAI


class AIService:
    """Orchestrates AI-powered property search and chat responses"""

    def __init__(self):
        self.openai_client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
        self.model = "gpt-4o"  # Chat model used for analysis and responses

    @classmethod
    def get_chat_response(cls, conversation: List[Dict]) -> Dict:
        """Generate AI response with property recommendations"""
        # 1. Analyze user message with GPT-4o
        # 2. Extract search parameters
        # 3. Perform vector similarity search
        # 4. Fallback to traditional filtering
        # 5. Generate contextualized response
```

Vector Embedding Pipeline

```python
from typing import Dict, List


class EmbeddingService:
    """Manages OpenAI embeddings for semantic search"""

    def __init__(self):
        self.embedding_model = "text-embedding-3-small"

    @classmethod
    def generate_embedding(cls, text: str) -> List[float]:
        """Generate embedding vector for text"""
        # OpenAI API call for embedding generation

    @classmethod
    def search_properties_by_text(cls, query: str, limit: int = 10) -> List[Dict]:
        """Semantic property search using vector similarity"""
        # 1. Generate query embedding
        # 2. Calculate cosine similarity with stored embeddings
        # 3. Return ranked results
```

Property Text Representation

Properties are converted to searchable text that captures all relevant information:

```python
def create_property_embedding_text(property_obj: Property) -> str:
    """Create comprehensive text representation for embedding"""
    parts = []

    # Basic information (description is nullable, so skip it when empty)
    parts.append(f"{property_obj.title}")
    if property_obj.description:
        parts.append(f"{property_obj.description}")

    # Location details
    parts.append(f"Located in {property_obj.city}, {property_obj.state}")

    # Property characteristics
    parts.append(f"{property_obj.bedrooms} bedrooms, {property_obj.bathrooms} bathrooms")
    parts.append(f"{property_obj.square_feet} square feet")
    parts.append(f"{property_obj.property_type}")

    # Features and amenities (JSONB values may be a list or a single value)
    if property_obj.features:
        for category, items in property_obj.features.items():
            if not isinstance(items, (list, tuple)):
                items = [items]
            parts.append(f"{category}: {', '.join(str(item) for item in items)}")

    return " ".join(parts)
```

Similarity Search Algorithm

```python
from typing import List


def _cosine_similarity(vec1: List[float], vec2: List[float]) -> float:
    """Calculate cosine similarity between two vectors"""
    dot_product = sum(a * b for a, b in zip(vec1, vec2))
    magnitude1 = sum(a * a for a in vec1) ** 0.5
    magnitude2 = sum(b * b for b in vec2) ** 0.5

    # A zero vector has no direction, so similarity is defined as 0 here
    if magnitude1 == 0 or magnitude2 == 0:
        return 0.0

    return dot_product / (magnitude1 * magnitude2)
```
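
To make the ranking step concrete, here is a minimal sketch of how stored embeddings could be scored against a query vector. It assumes each row is a plain dict with `id` and `embedding_vector` keys, where `embedding_vector` holds a JSON string as in the schema above. The `rank_properties` helper and the toy two-dimensional vectors are illustrative and not taken from the Homester codebase.

```python
import json
from typing import Dict, List, Tuple


def cosine_similarity(vec1: List[float], vec2: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero magnitude."""
    dot_product = sum(a * b for a, b in zip(vec1, vec2))
    magnitude1 = sum(a * a for a in vec1) ** 0.5
    magnitude2 = sum(b * b for b in vec2) ** 0.5
    if magnitude1 == 0 or magnitude2 == 0:
        return 0.0
    return dot_product / (magnitude1 * magnitude2)


def rank_properties(
    query_vec: List[float], rows: List[Dict], limit: int = 10
) -> List[Tuple[int, float]]:
    """Return (property_id, score) pairs sorted by descending similarity."""
    scored = []
    for row in rows:
        raw = row.get("embedding_vector")
        if not raw:
            continue  # property has no stored embedding yet
        scored.append((row["id"], cosine_similarity(query_vec, json.loads(raw))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 2-dimensional vectors; real embeddings have many more dimensions.
rows = [
    {"id": 1, "embedding_vector": json.dumps([1.0, 0.0])},
    {"id": 2, "embedding_vector": json.dumps([0.0, 1.0])},
    {"id": 3, "embedding_vector": None},
    {"id": 4, "embedding_vector": json.dumps([1.0, 1.0])},
]
ranked = rank_properties([1.0, 0.0], rows, limit=2)
```

Scoring every row in Python is linear in the number of properties, which is likely fine for small catalogs but worth revisiting as the table grows.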

Authentication & Security

JWT Implementation

```python
from datetime import timedelta

from flask_jwt_extended import create_access_token, create_refresh_token

# Token configuration
JWT_ACCESS_TOKEN_EXPIRES = timedelta(hours=24)
JWT_REFRESH_TOKEN_EXPIRES = timedelta(days=30)


class User(db.Model):
    def get_tokens(self):
        """Generate JWT tokens for user"""
        access_token = create_access_token(identity=str(self.id))
        refresh_token = create_refresh_token(identity=str(self.id))
        return {
            'access_token': access_token,
            'refresh_token': refresh_token
        }
```

Security Middleware

```python
from flask_cors import CORS
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

# Rate limiting configuration
limiter = Limiter(
    key_func=get_remote_address,
    default_limits=["200 per day", "50 per hour"]
)

# CORS configuration
CORS(app, origins=[
    "http://localhost:3000",
    "https://try.homester.in"
])

# Input validation
def validate_email(email: str) -> bool:
    """Validate email format using regex"""

def validate_password(password: str) -> bool:
    """Validate password strength requirements"""
```
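
The two validator stubs above can be filled in along these lines. This is only a sketch: the email pattern is deliberately loose, and the password policy (at least 8 characters, with a letter and a digit) is an assumption, since the source does not state the actual strength requirements.

```python
import re

# Deliberately simple pattern: one "@", no whitespace, and a dot in the domain.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_email(email: str) -> bool:
    """Return True when the address matches the basic pattern above."""
    return bool(EMAIL_PATTERN.match(email))


def validate_password(password: str) -> bool:
    """Assumed policy: at least 8 characters with a letter and a digit."""
    return (
        len(password) >= 8
        and any(ch.isalpha() for ch in password)
        and any(ch.isdigit() for ch in password)
    )
```

A pattern this permissive accepts some addresses that cannot receive mail, so it is best paired with a verification email, which the `is_verified` column in the schema suggests is already planned.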

Data Flow

Property Search Flow

```
1. User Request
   ├── GET /api/properties/search?min_price=300000&bedrooms=3
   └── Authorization: Bearer <token>

2. Authentication Middleware
   ├── Validate JWT token
   ├── Extract user identity
   └── Authorize request

3. Input Validation
   ├── Validate query parameters
   ├── Sanitize input values
   └── Apply default limits

4. Service Layer
   ├── PropertyService.search_properties()
   ├── Build SQLAlchemy query with filters
   └── Execute with pagination

5. Database Query
   ├── SELECT * FROM properties
   ├── WHERE price >= 300000 AND bedrooms = 3
   └── ORDER BY created_at DESC LIMIT 20

6. Response Formation
   ├── Serialize property objects
   ├── Add pagination metadata
   └── Return JSON response
```
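
The input validation step in the flow above might look roughly like the following. The function name `parse_search_params`, the `page` and `per_page` parameter names, and the `MAX_PER_PAGE` cap are assumptions made for illustration. Only `min_price`, `bedrooms`, and the page size of 20 come from the flow itself.

```python
from typing import Any, Dict, Mapping

DEFAULT_PER_PAGE = 20   # matches the LIMIT 20 shown in the flow above
MAX_PER_PAGE = 100      # assumed upper bound, not taken from the source


def parse_search_params(args: Mapping[str, str]) -> Dict[str, Any]:
    """Convert raw query-string values into typed, bounded search filters."""
    filters: Dict[str, Any] = {}

    for name in ("min_price", "bedrooms"):
        raw = args.get(name)
        if raw is None or raw == "":
            continue  # filter not supplied
        try:
            value = int(raw)
        except ValueError:
            raise ValueError(f"{name} must be an integer, got {raw!r}") from None
        if value < 0:
            raise ValueError(f"{name} must not be negative")
        filters[name] = value

    # Apply default limits: page starts at 1, per_page is clamped to a range.
    page = max(int(args.get("page", 1)), 1)
    per_page = int(args.get("per_page", DEFAULT_PER_PAGE))
    filters["page"] = page
    filters["per_page"] = min(max(per_page, 1), MAX_PER_PAGE)
    return filters


filters = parse_search_params({"min_price": "300000", "bedrooms": "3"})
```

In a Flask view, `request.args` could be passed in directly, and a `ValueError` would map naturally onto a 400 response.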

AI Chat Flow

```
1. User Message
   ├── POST /api/chat/chat
   ├── {"message": "I need a house in Texas under $500k"}
   └── Authorization: Bearer <token>

2. Message Processing
   ├── Save user message to database
   ├── Retrieve conversation history
   └── Prepare context for AI

3. AI Analysis
   ├── Send conversation to OpenAI GPT-4o
   ├── Extract search intent and parameters
   └── Generate conversational response

4. Property Search
   ├── Generate embedding for user query
   ├── Calculate similarity with property embeddings
   ├── Rank results by relevance score
   └── Fallback to traditional search if needed

5. Response Generation
   ├── Combine AI response with property data
   ├── Save assistant message to database
   └── Return structured response with properties
```
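
The "fallback to traditional search if needed" step can be sketched as below. The source does not say when the fallback triggers, so this example assumes it happens when semantic search fails or returns nothing above a similarity cut-off. `search_with_fallback`, `SIMILARITY_THRESHOLD`, and the two stub search functions are hypothetical names.

```python
from typing import Callable, Dict, List

SIMILARITY_THRESHOLD = 0.3  # hypothetical cut-off, tune against real data


def search_with_fallback(
    query: str,
    filters: Dict,
    semantic_search: Callable[[str], List[Dict]],
    filter_search: Callable[[Dict], List[Dict]],
    limit: int = 10,
) -> Dict:
    """Try semantic search first; use filter search when it finds nothing useful."""
    try:
        hits = [h for h in semantic_search(query) if h["score"] >= SIMILARITY_THRESHOLD]
    except Exception:
        hits = []  # e.g. the embedding API was unavailable
    if hits:
        hits.sort(key=lambda h: h["score"], reverse=True)
        return {"strategy": "semantic", "properties": hits[:limit]}
    return {"strategy": "filters", "properties": filter_search(filters)[:limit]}


# Stub search functions standing in for the real services.
def fake_semantic(query: str) -> List[Dict]:
    return [{"id": 7, "score": 0.82}, {"id": 9, "score": 0.12}]


def fake_filters(filters: Dict) -> List[Dict]:
    return [{"id": 3}]


result = search_with_fallback(
    "house in Texas under $500k",
    {"state": "TX", "max_price": 500000},
    fake_semantic,
    fake_filters,
)
```

Returning the strategy alongside the results makes it easier to log how often the fallback path is taken.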

Deployment Architecture

Docker Configuration

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port
EXPOSE 5000

# Run with Gunicorn (--reload is a development option and is omitted here)
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--reuse-port", "main:app"]
```

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: homester-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: homester-api
  template:
    metadata:
      labels:
        app: homester-api
    spec:
      containers:
        - name: api
          image: homester-api:latest
          ports:
            - containerPort: 5000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: homester-secrets
                  key: database-url
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: homester-secrets
                  key: openai-api-key

Production Environment

```yaml
Infrastructure:
  Web Server: Nginx (reverse proxy, SSL termination)
  Application Server: Gunicorn (multiple workers)
  Database: PostgreSQL (primary + read replicas)
  Cache: Redis (session storage, rate limiting)
  CDN: CloudFront (static assets)
  Monitoring: Prometheus + Grafana
  Logging: ELK Stack (Elasticsearch, Logstash, Kibana)

Scaling Strategy:
  Horizontal: Multiple API server instances
  Database: Read replicas for queries
  Cache: Redis cluster for high availability
  CDN: Global distribution for assets
```

Performance & Scalability

Database Optimization

```python
# Connection pooling configuration
SQLALCHEMY_ENGINE_OPTIONS = {
    "pool_recycle": 300,    # Recycle connections every 5 minutes
    "pool_pre_ping": True,  # Validate connections before use
    "pool_size": 10,        # Base connection pool size
    "max_overflow": 20      # Maximum additional connections
}

# Query optimization with indexes
def search_properties(filters, page=1, per_page=20):
    """Optimized property search with indexed columns"""
    query = Property.query

    # Use indexed columns for filtering
    if filters.get('min_price'):
        query = query.filter(Property.price >= filters['min_price'])

    if filters.get('bedrooms'):
        query = query.filter(Property.bedrooms == filters['bedrooms'])

    # Pagination for large result sets
    return query.paginate(page=page, per_page=per_page)
```
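
Because each Gunicorn worker is a separate process with its own SQLAlchemy pool, the pool settings multiply across workers and replicas. The arithmetic below is a rough worst-case estimate. `pool_size` and `max_overflow` come from the configuration above and `replicas` from the Kubernetes manifest, while the worker count of 4 is an assumed value, since the source does not specify one.

```python
def max_db_connections(pool_size: int, max_overflow: int, workers: int, replicas: int) -> int:
    """Worst-case number of simultaneous connections opened by all API processes."""
    per_process = pool_size + max_overflow
    return per_process * workers * replicas


# pool_size=10 and max_overflow=20 come from the configuration above,
# replicas=3 from the Kubernetes manifest; workers=4 is an assumed value.
total = max_db_connections(pool_size=10, max_overflow=20, workers=4, replicas=3)
print(total)  # 360
```

PostgreSQL ships with `max_connections` set to 100 by default, so a deployment shaped like this would need either a higher server limit, smaller pools, or a connection pooler in front of the database.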

Caching Strategy

```python
# Redis integration for caching
@cache.memoize(timeout=300)
def get_property_statistics():
    """Cache expensive statistical queries"""

# Rate limiting with Redis backend
RATELIMIT_STORAGE_URL = "redis://localhost:6379/1"
```
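
To illustrate what a 300-second memoize buys, here is a small in-process sketch of time-based memoization. It is not how the caching extension is implemented and it does not use Redis. The `memoize_with_ttl` decorator and the fake statistics payload are made up for the example.

```python
import functools
import time
from typing import Any, Callable, Dict, Tuple


def memoize_with_ttl(timeout: float, clock: Callable[[], float] = time.monotonic):
    """Cache a function's results per argument tuple for `timeout` seconds."""
    def decorator(func):
        store: Dict[Tuple, Tuple[float, Any]] = {}

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < timeout:
                return hit[1]          # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper
    return decorator


calls = []


@memoize_with_ttl(timeout=300)
def get_property_statistics() -> dict:
    calls.append("query")              # stands in for an expensive query
    return {"listings": 42}


get_property_statistics()
get_property_statistics()              # served from the cache
```

A shared backend such as Redis matters once several workers run side by side, because an in-process dictionary like this one is not visible to other processes.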

Async Support

```python
import asyncio
from typing import List

# Ready for async operations
async def generate_embeddings_batch(properties: List[Property]):
    """Async batch processing for embeddings"""
    # generate_embedding must be a coroutine function for gather() to await it
    tasks = [generate_embedding(prop.description) for prop in properties]
    return await asyncio.gather(*tasks)
```
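
Launching one request per property at once can run into API rate limits on large batches. One common way to bound concurrency is a semaphore, sketched here with a stand-in coroutine in place of the real embedding call. `fake_generate_embedding`, `generate_embeddings_bounded`, and the `max_concurrency` default are assumptions for illustration.

```python
import asyncio
from typing import List


async def fake_generate_embedding(text: str) -> List[float]:
    """Stand-in for the embedding API call; returns a toy 1-dimensional vector."""
    await asyncio.sleep(0)             # yield control as a network call would
    return [float(len(text))]


async def generate_embeddings_bounded(
    texts: List[str], max_concurrency: int = 5
) -> List[List[float]]:
    """Embed all texts while running at most `max_concurrency` calls at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text: str) -> List[float]:
        async with semaphore:
            return await fake_generate_embedding(text)

    # gather preserves input order, so results line up with `texts`.
    return await asyncio.gather(*(embed_one(t) for t in texts))


vectors = asyncio.run(
    generate_embeddings_bounded(["loft", "bungalow", "condo"], max_concurrency=2)
)
```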

Development Guidelines

Code Organization

Backend Structure:
├── api/                      # API endpoints (Flask-RESTX namespaces)
│   ├── __init__.py
│   ├── auth.py               # Authentication endpoints
│   ├── properties.py         # Property management
│   └── chat.py               # AI chat interface
├── services/                 # Business logic layer
│   ├── __init__.py
│   ├── ai_service.py         # OpenAI integration
│   ├── property_service.py   # Property operations
│   └── embedding_service.py  # Vector embeddings
├── utils/                    # Utility functions
│   ├── __init__.py
│   ├── auth.py               # Authentication helpers
│   └── validators.py         # Input validation
├── models.py                 # SQLAlchemy models
├── config.py                 # Configuration management
├── app.py                    # Flask application factory
└── main.py                   # Application entry point

Error Handling

```python
# Consistent error responses
@api.errorhandler(ValidationError)
def handle_validation_error(error):
    return {'message': 'Validation failed', 'errors': error.messages}, 400

@api.errorhandler(NotFound)
def handle_not_found(error):
    return {'message': 'Resource not found'}, 404

@api.errorhandler(Unauthorized)
def handle_unauthorized(error):
    return {'message': 'Authentication required'}, 401
```

Testing Strategy

```python
# Unit tests for services
def test_property_search():
    """Test property search functionality"""
    filters = {'min_price': 300000, 'bedrooms': 3}
    result = PropertyService.search_properties(filters)
    assert len(result['properties']) > 0

# Integration tests for API endpoints
def test_auth_endpoint(client):
    """Test user registration endpoint"""
    response = client.post('/api/auth/register', json={
        'email': 'user@example.com',  # placeholder address
        'password': 'securepassword123'
    })
    assert response.status_code == 201
```

Environment Configuration

```python
# config.py - Environment-based configuration
import os


class Config:
    # Database
    SQLALCHEMY_DATABASE_URI = os.environ.get('DATABASE_URL')

    # Authentication
    JWT_SECRET_KEY = os.environ.get('JWT_SECRET_KEY')

    # AI Services
    OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY')

    # Performance
    SQLALCHEMY_ENGINE_OPTIONS = {
        "pool_recycle": 300,
        "pool_pre_ping": True,
        "pool_size": int(os.environ.get('DB_POOL_SIZE', 10))
    }
```
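
Since `os.environ.get` returns `None` for unset variables, a missing secret would only surface later at request time. A possible complement is a startup check that fails fast, sketched below. The `load_settings` helper and the choice of which variables count as required are assumptions, not part of the documented configuration.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = ("DATABASE_URL", "JWT_SECRET_KEY", "OPENAI_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Read required settings and raise one error listing everything missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "SQLALCHEMY_DATABASE_URI": env["DATABASE_URL"],
        "JWT_SECRET_KEY": env["JWT_SECRET_KEY"],
        "OPENAI_API_KEY": env["OPENAI_API_KEY"],
        "DB_POOL_SIZE": int(env.get("DB_POOL_SIZE", 10)),
    }


# Example with an explicit mapping instead of the real process environment.
settings = load_settings({
    "DATABASE_URL": "postgresql://localhost/homester",
    "JWT_SECRET_KEY": "dev-only-secret",
    "OPENAI_API_KEY": "placeholder-key",
})
```

Passing the environment in as a mapping keeps the check easy to unit test without touching real secrets.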

This architecture provides a robust, scalable foundation for the Homester real estate platform, combining traditional REST API patterns with modern AI capabilities while maintaining production-ready performance and security standards.

