The problem: joins everywhere
Pull attributes from several related models in one query, across a lot of rows, and your SQL slows down fast.
Classic example, with where clauses producing a heavy join:
# app/models/article.rb
class Article < ActiveRecord::Base
belongs_to :state # foreign_key: state_id
belongs_to :category # foreign_key: category_id
belongs_to :user # columns: id, username (index), user_id (fk, index)
end
# app/controllers/articles_controller.rb
def show
Article.includes([:state, :category]).
where("states.name" => "published",
"categories.name" => "hiking",
"articles.user_id" => 123)
end
Multiple joins, and it gets worse as volume grows.
Scopes make it readable, not faster
You can refactor the query into scopes:
# app/models/article.rb
class Article < ActiveRecord::Base
belongs_to :state
belongs_to :category
belongs_to :user
scope :for_state, -> (name) { joins(:state).where(states: { name: name }) }
scope :for_category, -> (name) { joins(:category).where(categories: { name: name }) }
end
# Usage:
Article.for_state("published").for_category("hiking")
Scopes are modular and maintainable. But the SQL is almost identical, so performance does not change. Check it with .to_sql in your console.
The real fix: denormalize
Normalization protects data integrity. It can also hurt reads.
Denormalizing trades a little redundancy for faster reads. If query speed matters here, reconsider the schema:
# app/models/article.rb
class Article < ActiveRecord::Base
# Enums simulate what a related table would have done with less overhead
STATES = %w(draft review published archived)
CATEGORIES = %w(tips faqs misc hiking)
validates :state, inclusion: { in: STATES }
validates :category, inclusion: { in: CATEGORIES }
end
# Easier query without joins:
Article.where(state: "published", category: "hiking")
If you are not multi-tenant (not a SaaS) and the categorization is simple and static, use enums on the table instead of separate tables. A single blog for one user does not need category tables.
Rule of thumb
Do not normalize something that does not have its own controller.
Take a tagging system. You could model it with has_and_belongs_to_many, or with a jsonb array column. HABTM is worth it only if users create tags with names and descriptions and search within them. Otherwise, denormalize the complexity away.
Why this matters
Most performance problems get solved by defining the problem better and removing complexity, not by piling on more code.
That extra complexity is usually there because nobody spent the time to question the original requirements. Debating the design is not the liability. Coding something you do not understand is, because it ends in more meetings and more PRs to untangle the code later.