Complete step-by-step guide for implementing Amazon S3 Vectors with semantic search capabilities
This implementation demonstrates Amazon S3 Vectors, a native AWS service for storing and searching vector embeddings with sub-second query performance. We built a complete semantic search solution on top of it. The walkthrough uses the following IAM actions:
s3vectors:CreateVectorBucket
s3vectors:CreateIndex
s3vectors:PutVectors
s3vectors:ListVectors
s3vectors:QueryVectors
s3vectors:GetVectors
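The actions listed above can be collected into a single IAM policy document. The following is a minimal sketch rather than the policy used by this project: the helper name `build_s3vectors_policy` is hypothetical, and the `Resource` value of `"*"` is an assumption made for brevity. A production policy would normally scope `Resource` to specific vector bucket and index ARNs.

```python
import json

# IAM actions taken from the list above.
S3_VECTORS_ACTIONS = [
    "s3vectors:CreateVectorBucket",
    "s3vectors:CreateIndex",
    "s3vectors:PutVectors",
    "s3vectors:ListVectors",
    "s3vectors:QueryVectors",
    "s3vectors:GetVectors",
]


def build_s3vectors_policy(resource="*"):
    """Return an IAM policy document (as a dict) that allows the actions above.

    `resource` defaults to "*" purely for illustration (assumption).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(S3_VECTORS_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into the IAM console.
    print(json.dumps(build_s3vectors_policy(), indent=2))
```

Keeping the action list in one place makes it easy to drop actions that a read-only role does not need, such as `s3vectors:CreateIndex`.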
aws s3vectors create-vector-bucket \
--vector-bucket-name s3vector \
--region us-east-1
Result: Native S3 Vector bucket created in us-east-1 region
aws s3vectors create-index \
--vector-bucket-name s3vector \
--index-name documents \
--dimension 384 \
--distance-metric cosine \
--data-type float32 \
--region us-east-1
Result: Vector index "documents" with 384 dimensions and cosine distance metric
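The index above ranks vectors by cosine distance. As a rough illustration of what that metric measures, the sketch here computes it in plain Python using the common definition of cosine distance as 1 minus cosine similarity. The function name `cosine_distance` is hypothetical, and this is not the service's internal implementation, so the distances the service reports may be computed or scaled differently.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Vectors pointing the same way are close; orthogonal ones are farther apart.
    print(cosine_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

Because the metric depends only on direction, scaling a vector does not change its distance to other vectors, which is why it is a common choice for text embeddings.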
# List vector buckets
aws s3vectors list-vector-buckets --region us-east-1
# List indexes
aws s3vectors list-indexes \
--vector-bucket-name s3vector \
--region us-east-1
Created upload_vectors.py to generate sample document vectors:
import json
import numpy as np

# Sample document vectors (384 dimensions).
# Random values stand in for real embeddings here.
documents = [
    {
        "key": "doc_001",
        "data": {
            "float32": np.random.rand(384).astype(np.float32).tolist()
        },
        "metadata": {
            "document_name": "annual_report_2024.pdf",
            "document_type": "report",
            "department": "finance",
            "author": "Finance Team",
            "year": "2024"
        }
    }
    # ... additional documents
]

# Save vectors to JSON file
with open('vectors.json', 'w') as f:
    json.dump(documents, f, indent=2)
All documents include realistic business content with proper metadata schema.
python3 upload_vectors.py
Generates vectors.json with properly formatted vector data
aws s3vectors put-vectors \
--vector-bucket-name s3vector \
--index-name documents \
--vectors file://vectors.json \
--region us-east-1
Result: 3 document vectors uploaded successfully
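The upload above sends three vectors in a single call. For larger document sets, the vectors would typically be split across several `put-vectors` calls. A minimal sketch of that batching follows. The helper name `chunk_vectors` is hypothetical and the default batch size of 500 is an assumption, so check the current S3 Vectors quotas for the actual per-request limit.

```python
def chunk_vectors(vectors, batch_size=500):
    """Yield consecutive slices of `vectors`, each at most `batch_size` long.

    The default of 500 is an assumed per-request limit, not a verified quota.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(vectors), batch_size):
        yield vectors[start:start + batch_size]


if __name__ == "__main__":
    # Placeholder entries; real entries would follow the vectors.json format.
    sample = [{"key": f"doc_{i:03d}"} for i in range(1, 8)]
    for number, batch in enumerate(chunk_vectors(sample, batch_size=3), start=1):
        print(number, [entry["key"] for entry in batch])
```

Each batch could then be written to its own JSON file and passed to `put-vectors` with `--vectors file://...`, as in the command above.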
S3 Vectors requires a specific JSON format for each vector:
{
  "key": "doc_001",
  "data": {
    "float32": [0.123, 0.456, 0.789, ...]
  },
  "metadata": {
    "document_name": "annual_report_2024.pdf",
    "document_type": "report",
    "department": "finance"
  }
}
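Before uploading, it can help to check that each entry matches the format shown above. The sketch here is one way to do that in plain Python. `validate_vector` is a hypothetical helper, the expected dimension of 384 comes from the `--dimension` value used when creating the index, and the checks cover only the fields shown in this guide rather than every rule the service enforces. Treating `metadata` as optional is also an assumption.

```python
EXPECTED_DIMENSION = 384  # matches --dimension on the "documents" index


def validate_vector(entry, dimension=EXPECTED_DIMENSION):
    """Return a list of problems found in one vector entry (empty if none)."""
    problems = []

    key = entry.get("key")
    if not isinstance(key, str) or not key:
        problems.append("key must be a non-empty string")

    data = entry.get("data")
    values = data.get("float32") if isinstance(data, dict) else None
    if not isinstance(values, list):
        problems.append("data.float32 must be a list of numbers")
    else:
        if len(values) != dimension:
            problems.append(f"expected {dimension} values, found {len(values)}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(v, (int, float)) and not isinstance(v, bool)
                   for v in values):
            problems.append("data.float32 must contain only numbers")

    # Metadata is treated as optional here (assumption).
    if "metadata" in entry and not isinstance(entry["metadata"], dict):
        problems.append("metadata must be a JSON object")

    return problems


if __name__ == "__main__":
    sample = {
        "key": "doc_001",
        "data": {"float32": [0.0] * 384},
        "metadata": {"department": "finance"},
    }
    print(validate_vector(sample))
```

Running this check over every entry loaded from vectors.json before calling `put-vectors` surfaces format problems locally instead of as a rejected upload.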
# List uploaded vectors
aws s3vectors list-vectors \
--vector-bucket-name s3vector \
--index-name documents \
--region us-east-1
# Get specific vector details
aws s3vectors get-vectors \
--vector-bucket-name s3vector \
--index-name documents \
--keys doc_001 \
--region us-east-1
aws s3vectors query-vectors \
--vector-bucket-name s3vector \
--index-name documents \
--query-vector file://query_vector.json \
--top-k 5 \
--region us-east-1
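The query command above reads its vector from query_vector.json, which this guide does not show. A minimal sketch of producing that file follows, assuming the query vector uses the same `float32` wrapper as the stored vectors. The helper names `build_query_vector` and `write_query_vector` are hypothetical. A real query would use an embedding produced by the same model as the stored documents, not arbitrary numbers.

```python
import json


def build_query_vector(values, dimension=384):
    """Wrap a list of numbers in the {"float32": [...]} structure.

    `dimension` defaults to 384 to match the "documents" index.
    """
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} values, found {len(values)}")
    return {"float32": [float(v) for v in values]}


def write_query_vector(values, path="query_vector.json", dimension=384):
    """Write the wrapped query vector to `path` and return the payload."""
    payload = build_query_vector(values, dimension)
    with open(path, "w") as f:
        json.dump(payload, f)
    return payload
```

Calling `write_query_vector(embedding)` with a 384-value embedding creates the file that `--query-vector file://query_vector.json` expects. The length check catches a mismatch with the index dimension before the request is sent.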
Created an interactive S3 Vectors demo page.
Updated all website pages to include S3 Vectors navigation:
<li class="nav-item">
<a class="nav-link" href="s3-vectors.html">S3 Vectors Demo</a>
</li>
<li class="nav-item">
<a class="nav-link" href="s3-vectors-implementation.html">Implementation Guide</a>
</li>
# GitHub repository updates
git add .
git commit -m "Add S3 Vectors implementation and documentation"
git push origin main
# Amplify auto-deployment triggered
# Live URL: https://dfitqm3lm3maf.amplifyapp.com
Bucket: s3vector
Index: documents
[Summary cards: Vector Dimensions, Query Response Time, Cost Reduction vs Traditional DBs, AWS Native Integration]
awsweek2.0/
├── s3-vectors.html # Interactive demo page
├── s3-vectors-implementation.html # This documentation
├── sample-documents/ # Sample document content
│ ├── annual_report_2024.md
│ ├── product_manual.md
│ ├── meeting_notes.md
│ └── README.md
├── document_vector_storage.py # Vector storage script
├── query_document_vectors.py # Semantic search script
└── upload_vectors.py # Vector upload utility