Are your Entra ID (formerly Azure AD, AAD) logs becoming too expensive in Sentinel? I have a theory about why this happens. These logs seem to be designed mainly for reporting, and compact sizing was probably not a key focus. In my experience, they grow rapidly and become quite costly to keep in Sentinel.
The two main types of Entra logs that contribute to this are the interactive and non-interactive sign-in logs, with the latter being the largest. Interactive logs track user-driven activities, while non-interactive logs capture activities performed by a client app or OS components on behalf of a user.
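To check which of the two tables dominates in your own workspace, a minimal sketch against the standard Usage table could look like the query below. It assumes the default table names (SigninLogs for interactive sign-ins) and a 30-day window chosen only for illustration. Quantity is reported in MB, so the GB figure is approximate.

```kusto
Usage
| where TimeGenerated > ago(30d)                      // illustrative window, adjust as needed
| where DataType in ("SigninLogs", "AADNonInteractiveUserSignInLogs")
| where IsBillable == true                            // only count data you pay for
| summarize IngestedGB = round(sum(Quantity) / 1000.0, 2) by DataType
| sort by IngestedGB desc
```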
Interactive logs are influenced by user patterns, conditional access policies, and token refresh settings (usually 90 days by default, but many organizations set shorter intervals). Non-interactive log frequency is determined by the apps themselves, which is largely outside of our control, especially for apps like Teams and Exchange Online.
These logs are available for 90 days, and Entra provides a reporting tool to help you explore them.
Why would you send these logs to Sentinel?
- For dashboarding? Entra already provides an excellent dashboard, far better than any workbook.
- For alerting? If you’ve integrated with Microsoft 365 Defender (M365D), this might not be necessary. You can rely on the integration between Azure AD Identity Protection (AADIP) and M365D for alerting.
- For archival? Entra only retains the data for 90 days, though some of this data may also exist in the M365D logs.
Considering these options, using Sentinel for log ingestion and archival might become too expensive.
What are your options when costs get too high?
- Turn off Entra ID log integration with Sentinel: At least turn off the large non-interactive logs and rely on M365D for alerting and archival.
- Switch to the Basic tier in Log Analytics: This may reduce ingestion costs, but the logs are moved to archive after 8 days, so it does nothing to reduce archival costs.
- Send logs to Azure Data Explorer via an event hub: This significantly lowers both ingestion and archival costs. However, the logs won’t be directly available for Sentinel alert rules, and the setup can be more complex.
If you want to dig further into what drives the volume, the key non-interactive log columns are Identity, Status, ResourceDisplayName, IPAddress, AuthenticationProcessingDetails, and AppDisplayName. You can summarize by any of these on their own or in combination with Identity.
Example:
AADNonInteractiveUserSignInLogs
| summarize count() by Identity, ResourceDisplayName
| sort by count_ desc
Or:
AADNonInteractiveUserSignInLogs
| summarize count() by Status, AuthenticationProcessingDetails
| sort by count_ desc
The goal is to identify noisy inputs and work towards optimizing them.
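Event counts do not always track cost, because record sizes vary. As a rough sketch, the same kind of summary can be ranked by billed bytes using the standard _BilledSize column. The one-day window and the top-20 cutoff are arbitrary choices for illustration.

```kusto
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(1d)                       // illustrative window
| summarize Events = count(),
            BilledMB = round(sum(_BilledSize) / 1000.0 / 1000.0, 1)   // _BilledSize is in bytes
    by AppDisplayName, ResourceDisplayName
| top 20 by BilledMB desc
```

An app and resource pair that sits high on this list but low on the count-based lists is producing unusually large records, which makes it a good candidate for column trimming.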
If you’re applying an ingestion-time transformation, consider dropping the following columns: ConditionalAccessPolicies, Type, TokenIssuerType, TokenIssuerName, TenantId, SourceSystem, and others. These columns carry little value or duplicate other fields, and dropping them can cut the log size by roughly 50% without compromising forensic value.
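Before changing anything, it may be worth estimating how much a given set of columns actually contributes in your tenant. The sketch below uses estimate_data_size() on one hour of data. It only measures three of the columns named above, as an example, and the estimate is based on in-memory size, so treat the percentage as indicative rather than as a billing figure.

```kusto
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(1h)                       // a small sample is enough
| summarize TotalBytes = sum(estimate_data_size(*)),
            DroppedBytes = sum(estimate_data_size(ConditionalAccessPolicies, TokenIssuerType, TokenIssuerName))
| extend SavedPercent = round(100.0 * DroppedBytes / TotalBytes, 1)
```

If the numbers look worthwhile, the transformation itself could be as simple as `source | project-away ConditionalAccessPolicies, TokenIssuerType, TokenIssuerName`, assuming your data collection rule accepts that operator for this table. Test it on a non-production workspace first.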
You may also want to check for duplicate records. If you notice a large number of duplicates, it could be worth opening a support case.
Example:
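AADNonInteractiveUserSignInLogs // only a few hours of data is sufficient
| extend T2 = replace_string(tostring(CreatedDateTime), "/", "")
// build a key from fields that should be unique per sign-in event
| extend CheckString = strcat(Identity, AppDisplayName, AppId, HomeTenantId, AutonomousSystemNumber, T2)
| summarize count() by CheckString
| where count_ > 1 // keep only keys that occur more than once
| sort by count_ desc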