Indexing content
Outline
- Architectural Choice: Leverage Search and Navigation for .NET-heavy sites or Optimizely Graph for decoupled, SaaS-first delivery
- Sync Strategy: Combine real-time IContentEvents for immediate updates with scheduled jobs for full-tree integrity
- Precision Control: Use the [Searchable] attribute and Conventions API to eliminate index bloat and technical noise
- Security Default: Ensure data privacy by enforcing FilterForVisitor() and the IExcludeFromSearch interface
In Optimizely CMS 12, indexing is the process of extracting, transforming, and sending content data to a specialized search engine to enable high-performance retrieval and filtering. This architectural component is essential for building complex navigation systems, faceted search interfaces, and omnichannel data delivery. Technical teams must architect the indexing pipeline to ensure data precision while maintaining system performance within PaaS constraints.
1. Indexing Engine Architecture
Optimizely CMS 12 primarily utilizes two indexing technologies. Understanding their trade-offs is essential before committing to an architecture.
Indexing Technologies
Optimizely Search and Navigation (formerly Find)
Historically the standard for CMS indexing, this service uses a proprietary .NET client to push content to an Elasticsearch-based back end. It is tightly coupled with the IContent repository and supports advanced features like Unified Search, Best Bets, and LINQ-based query composition. Best suited for .NET-heavy monolithic sites where search and CMS rendering are co-located.
Optimizely Graph
The modern, SaaS-based GraphQL service. It acts as a platform-agnostic content hub: CMS data is indexed and then queried through a strongly typed GraphQL schema. In PaaS environments, Optimizely Graph is preferred for high-scale, multi-platform architectures where content delivery is decoupled from the web application - for example, headless front ends, mobile apps, or digital signage.
2. Managing the Indexing Pipeline
The indexing process is handled through two distinct mechanisms to ensure both real-time accuracy and long-term data integrity.
Event-Driven Indexing
By default, the search integration subscribes to the IContentEvents pipeline. When a content item is published or deleted, it automatically triggers a delta indexing request for that specific item, so search results stay synchronized with the editorial state without manual intervention.
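The built-in integration wires up its own handlers, so no code is needed for standard delta indexing. The sketch below only illustrates the event surface it relies on, by attaching a custom handler to the same publish event. The module name `PublishAuditInitialization` and the trace message are hypothetical.

```csharp
using System.Diagnostics;
using EPiServer;
using EPiServer.Core;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.ServiceLocation;

// Hypothetical module: observes the same publish event the indexer reacts to.
[InitializableModule]
[ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
public class PublishAuditInitialization : IInitializableModule
{
    public void Initialize(InitializationEngine context)
    {
        var events = context.Locate.Advanced.GetInstance<IContentEvents>();
        events.PublishedContent += OnPublished;
    }

    public void Uninitialize(InitializationEngine context)
    {
        // Detach the handler so it is not registered twice after a restart.
        var events = context.Locate.Advanced.GetInstance<IContentEvents>();
        events.PublishedContent -= OnPublished;
    }

    private static void OnPublished(object? sender, ContentEventArgs e)
    {
        // Illustrative only: records which item was just published.
        Trace.WriteLine($"Published content {e.ContentLink}");
    }
}
```

Keeping such handlers lightweight matters, because they run as part of the editor's publish action.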
Scheduled Jobs
For bulk operations or repairing index discrepancies - for example, after a database restore - run the "Optimizely Search and Navigation Indexing Job" or the "Content Graph Sync Job". These jobs perform a full crawl of the content tree, so every item is re-indexed according to the conventions defined in the currently deployed code.
3. Controlling Index Inclusions and Exclusions
To prevent performance degradation and index bloat, technical governance must be applied to determine exactly which content and properties are sent to the search engine. There are two levels of control.
Property-Level Control: The [Searchable] Attribute
The [Searchable] attribute controls whether a property's value is included in the default full-text index. Technical data, structural IDs, or configuration fields should be excluded to keep the index clean and relevant.
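A minimal sketch of property-level control follows. The page type `ArticlePage` and its properties are hypothetical; the attribute itself comes from `EPiServer.DataAnnotations`.

```csharp
using EPiServer.Core;
using EPiServer.DataAnnotations;

// Hypothetical content type used only for illustration.
[ContentType(DisplayName = "Article Page")]
public class ArticlePage : PageData
{
    // Editorial text that visitors should be able to find.
    [Searchable]
    public virtual string Heading { get; set; }

    // Technical value with no search relevance: keep it out of the full-text index.
    [Searchable(false)]
    public virtual string TrackingCode { get; set; }
}
```

Marking the technical field explicitly documents the intent in the model, even where the default for a given property type would already exclude it.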
Type-Level Control: Programmatic Conventions
To exclude entire content types - for example, technical blocks or container pages - use the Conventions API inside an initialization module. This is a one-time registration that applies globally, without requiring attribute decoration on each content type class.
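As a sketch of type-level exclusion, assuming Search and Navigation (the `EPiServer.Find.Cms` package) is the engine in use, a convention can be registered once at startup. `ContainerPage` and the module name are hypothetical.

```csharp
using EPiServer.Core;
using EPiServer.DataAnnotations;
using EPiServer.Find.Cms;
using EPiServer.Find.Cms.Conventions;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;

// Hypothetical structural page type that should never appear in search results.
[ContentType(DisplayName = "Container Page")]
public class ContainerPage : PageData
{
}

[InitializableModule]
[ModuleDependency(typeof(EPiServer.Find.Cms.Module.IndexingModule))]
public class SearchConventionsInitialization : IInitializableModule
{
    public void Initialize(InitializationEngine context)
    {
        // Registered once; applies to event-driven and scheduled indexing alike.
        ContentIndexer.Instance.Conventions
            .ForInstancesOf<ContainerPage>()
            .ShouldIndex(page => false);
    }

    public void Uninitialize(InitializationEngine context)
    {
    }
}
```

Because the predicate receives the content instance, the same call can also express conditional rules, such as skipping only items under a particular branch. Items indexed before the convention was added stay in the index until they are removed or the index is rebuilt.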
4. Performance Optimization Strategies
Indexing can be resource-intensive. Implementing these strategies minimizes the impact on the application's runtime performance.
ContentArea Depth Management
By default, the content of blocks placed in a ContentArea is not indexed as part of the page that hosts them, which avoids circular reference loops and oversized documents. To include a specific block type's data in the index documents of the pages that use it, apply the IndexInContentAreas attribute to the block type definition.
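The opt-in can be sketched as follows, assuming Search and Navigation and its `EPiServer.Find.Cms` namespace. `TeaserBlock` is a hypothetical block type.

```csharp
using EPiServer.Core;
using EPiServer.DataAnnotations;
using EPiServer.Find.Cms;

// Hypothetical block type whose text should be searchable
// through the pages that place it in a ContentArea.
[ContentType(DisplayName = "Teaser Block")]
[IndexInContentAreas]
public class TeaserBlock : BlockData
{
    public virtual string Heading { get; set; }

    public virtual XhtmlString Body { get; set; }
}
```

Opting in per block type keeps purely decorative or structural blocks out of page documents.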
Batching and Parallelism (Optimizely Graph)
When using Optimizely Graph, batching parameters can be tuned in Program.cs. Increasing the grace period reduces the frequency of API calls during heavy editorial activity.
Pro tip: In a busy editorial environment, a BufferedIndexingGracePeriod of 10-15 seconds dramatically reduces the number of individual sync API calls without meaningfully delaying search result updates for visitors.
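To make the effect of a grace period concrete, here is a small self-contained model. It is not Optimizely's implementation and uses no Optimizely API. It assumes that every change arriving within the grace period of the first buffered change is sent in the same sync call. `GracePeriodModel` and `CountSyncCalls` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Simplified model of buffered indexing (an assumption, not the product's algorithm).
public static class GracePeriodModel
{
    // eventTimes: moments at which content changes occur, sorted ascending.
    // Returns how many sync calls result for the given grace period.
    public static int CountSyncCalls(IReadOnlyList<TimeSpan> eventTimes, TimeSpan gracePeriod)
    {
        int calls = 0;
        TimeSpan? windowStart = null;

        foreach (var time in eventTimes)
        {
            // A change outside the current window opens a new buffer, hence a new call.
            if (windowStart is null || time - windowStart.Value > gracePeriod)
            {
                calls++;
                windowStart = time;
            }
        }

        return calls;
    }
}
```

Under this model, six changes at 0, 2, 5, 12, 14, and 30 seconds cost six calls with no grace period and three calls with a 10-second grace period.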
5. Security and Governance in Indexing
Search results must always respect the CMS security model. If sensitive data is indexed and retrieved via a client-side API without proper filtering, it presents a security risk regardless of CMS access controls.
- Filtering for Visitor: Always apply .FilterForVisitor() in search queries to ensure the engine only returns content the current user is authorized to view. This must be applied at query time, not only at indexing time.
- Excluding Sensitive Types: If a content type implements IExcludeFromSearch, the indexing engine will automatically honor the ExcludeFromSearch property, providing a code-based circuit breaker for technical content governance.
Important: Indexing data does not apply CMS security automatically. A content item that is indexed but not filtered at query time may be accessible to unauthorized users via the search API, even if the CMS page itself is access-controlled.
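A query-time filter might look like the sketch below, assuming Search and Navigation with an injected `IClient`. `NewsPage` and `NewsSearchService` are hypothetical names, and the page size of 10 is arbitrary.

```csharp
using EPiServer.Core;
using EPiServer.DataAnnotations;
using EPiServer.Find;
using EPiServer.Find.Cms;

// Hypothetical content type used only for illustration.
[ContentType(DisplayName = "News Page")]
public class NewsPage : PageData
{
}

public class NewsSearchService
{
    private readonly IClient _client;

    public NewsSearchService(IClient client)
    {
        _client = client;
    }

    public IContentResult<NewsPage> Search(string query)
    {
        return _client.Search<NewsPage>()
            .For(query)
            // Restrict hits to content the current visitor is allowed to see.
            .FilterForVisitor()
            .Take(10)
            .GetContentResult();
    }
}
```

Routing all queries through one service like this makes it harder for an individual endpoint to omit the filter by accident.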
Conclusion
Efficient indexing in Optimizely CMS 12 requires a disciplined approach to content modeling and an understanding of the technical contract between .NET code and the search back end. By utilizing property attributes for field-level control, initialization modules for type-level conventions, and tuned batching parameters, technical teams can deliver a search experience that is both highly relevant and computationally efficient.
