Mobile Engineering — Architecture, Performance, and Production Reality
Mobile is not “frontend for small screens.” It is a fundamentally different engineering discipline: constrained hardware, hostile networks, gatekept distribution, users who will uninstall your app in under 3 seconds if it stutters. This chapter covers what a senior mobile engineer actually needs to know: the architecture patterns that survive production, the performance constraints that desktop engineers never think about, the infrastructure that makes updates possible without waiting 3 days for App Store review, and the system design patterns that interviewers use to separate senior candidates from everyone else.
Real-World Stories: Why Mobile Engineering Is Hard
Airbnb's React Native Divorce -- A $2B Company Walks Away
Instagram -- How a 13-Person Team Scaled to 100 Million Users on Mobile
Spotify's Mobile Journey -- From WebView to Native to Embedded Rust
Part I — Mobile Architecture
1. Mobile App Architecture Patterns
Architecture patterns in mobile are not academic exercises: they determine whether your codebase survives past three engineers, whether your crash rate stays below 1%, and whether new features take days or months to ship.
1.1 MVC (Model-View-Controller)
Apple’s original recommended pattern for iOS development. The UIViewController owns both the view lifecycle and the business logic, which leads to files with 2,000+ lines in any non-trivial app.
How it works on iOS:
- Model: Data structures and business rules (User, PaymentService)
- View: UIKit views or storyboards that display data
- Controller: UIViewController that mediates between Model and View, and also handles navigation, networking, formatting, animation, delegation, and everything else
1.2 MVP (Model-View-Presenter)
MVP was the Android community’s answer to Activity bloat. The key insight: extract the logic out of the Activity/Fragment into a Presenter that has no Android framework dependencies.
How it works:
- Model: Data layer (repositories, network, database)
- View: Activity/Fragment implements a View interface (LoginView.showError(), LoginView.navigateToHome())
- Presenter: Holds a reference to the View interface, contains all presentation logic, is unit-testable because it depends on an interface, not Android classes
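The MVP split above can be sketched without any UI framework. This is a minimal illustration in TypeScript rather than Android code: `AuthRepository`, `LoginPresenter`, `FakeLoginView`, and the error strings are hypothetical names chosen for the example; only the `LoginView` method names come from the text.

```typescript
// View contract: the only thing the Presenter knows about the UI.
interface LoginView {
  showError(message: string): void;
  navigateToHome(): void;
}

// Hypothetical model-layer dependency (an assumption for this sketch).
interface AuthRepository {
  login(email: string, password: string): boolean;
}

class LoginPresenter {
  private view: LoginView | null = null;
  private readonly auth: AuthRepository;

  constructor(auth: AuthRepository) {
    this.auth = auth;
  }

  attach(view: LoginView): void {
    this.view = view;
  }

  // Called when the screen goes away so the Presenter stops holding the View.
  detach(): void {
    this.view = null;
  }

  onLoginClicked(email: string, password: string): void {
    if (email.indexOf("@") < 1) {
      this.view?.showError("Enter a valid email");
      return;
    }
    if (this.auth.login(email, password)) {
      this.view?.navigateToHome();
    } else {
      this.view?.showError("Wrong email or password");
    }
  }
}

// A fake View records calls, so the Presenter can be tested with no UI at all.
class FakeLoginView implements LoginView {
  calls: string[] = [];
  showError(message: string): void {
    this.calls.push(`error:${message}`);
  }
  navigateToHome(): void {
    this.calls.push("home");
  }
}
```

Because the Presenter depends only on the `LoginView` interface, a test can attach a `FakeLoginView`, call `onLoginClicked`, and assert on the recorded calls. The explicit `attach`/`detach` pair is also where MVP gets fragile: a Presenter that outlives its View must null-check every call.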
1.3 MVVM (Model-View-ViewModel)
The dominant pattern in modern mobile development. Android Jetpack’s ViewModel + LiveData/StateFlow made it the default on Android. SwiftUI’s @Observable and Combine made it natural on iOS.
How it works:
- Model: Data layer (same as MVP)
- View: Activity/Fragment/SwiftUI View observes the ViewModel’s state
- ViewModel: Exposes observable state. Does not hold a reference to the View. Survives configuration changes on Android.
- No reference to the View — eliminates the leak/crash category that plagued MVP
- Survives configuration changes: Android’s ViewModel survives Activity recreation
- Reactive by default: LiveData/StateFlow/Combine naturally drive UI updates
- Testable — ViewModel is a plain class with observable outputs; test by asserting on state emissions
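To make "observable state, no reference to the View" concrete, here is a small framework-free sketch in TypeScript. `ObservableState` is a hypothetical stand-in for LiveData/StateFlow, and `CounterViewModel` with its `CounterUiState` is an invented example screen, not something from a real codebase.

```typescript
type Listener<T> = (value: T) => void;

// Minimal observable holder, standing in for LiveData/StateFlow (assumption).
class ObservableState<T> {
  private value: T;
  private readonly listeners = new Set<Listener<T>>();

  constructor(initial: T) {
    this.value = initial;
  }

  get current(): T {
    return this.value;
  }

  // Emits the current value immediately, then every update.
  // Returns a function that unsubscribes the listener.
  observe(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    listener(this.value);
    return () => {
      this.listeners.delete(listener);
    };
  }

  set(next: T): void {
    this.value = next;
    this.listeners.forEach((listener) => listener(next));
  }
}

interface CounterUiState {
  count: number;
  canDecrement: boolean;
}

// The ViewModel exposes state and accepts events; it never touches a View.
class CounterViewModel {
  readonly state = new ObservableState<CounterUiState>({ count: 0, canDecrement: false });

  increment(): void {
    this.update(this.state.current.count + 1);
  }

  decrement(): void {
    if (this.state.current.canDecrement) {
      this.update(this.state.current.count - 1);
    }
  }

  private update(count: number): void {
    this.state.set({ count, canDecrement: count > 0 });
  }
}
```

A test subscribes with `observe`, drives `increment()`/`decrement()`, and asserts on the sequence of emitted states, which is the "assert on state emissions" style described above. The View side is just another subscriber, so nothing in the ViewModel can leak it.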
1.4 MVI (Model-View-Intent)
MVI brings unidirectional data flow to mobile, inspired by Redux, Elm, and functional reactive programming. The state is a single immutable object. Every user action is an Intent. Every Intent produces a new State. The View renders the State. The flow is a single loop: the View emits an Intent, the Intent produces a new State, and the View renders that State.
1.5 VIPER
VIPER is the “enterprise” mobile architecture, popular in large iOS codebases at banks, Uber (early versions), and other organizations with strict separation-of-concerns requirements.
The components:
- View: Displays data, delegates user actions to the Presenter
- Interactor: Contains business logic, talks to data layer
- Presenter: Mediates between View and Interactor, formats data for display
- Entity: Plain data models
- Router: Handles navigation between screens
1.6 Clean Architecture for Mobile
Uncle Bob’s Clean Architecture adapted for mobile. The key idea: dependencies point inward. The inner layers (domain/business logic) know nothing about the outer layers (UI, network, database). In practice, the domain layer consists of use cases (such as GetNewsResourcesUseCase) that are pure Kotlin: no Android imports, no Hilt annotations, fully testable with plain JUnit.
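The "dependencies point inward" rule can be illustrated with a short sketch. It is written in TypeScript for brevity rather than Kotlin, and every name in it (`NewsItem`, `NewsRepository`, `GetLatestHeadlinesUseCase`, `InMemoryNewsRepository`) is hypothetical; it is not the use case mentioned above.

```typescript
// ---- Domain layer: no framework, network, or database imports ----
interface NewsItem {
  id: string;
  title: string;
  publishedAt: number; // epoch millis
}

// The domain defines the interface it needs; outer layers implement it.
interface NewsRepository {
  fetchAll(): NewsItem[];
}

class GetLatestHeadlinesUseCase {
  private readonly repository: NewsRepository;

  constructor(repository: NewsRepository) {
    this.repository = repository;
  }

  // Pure business rule: newest first, capped at `limit` titles.
  execute(limit: number): string[] {
    return [...this.repository.fetchAll()]
      .sort((a, b) => b.publishedAt - a.publishedAt)
      .slice(0, limit)
      .map((item) => item.title);
  }
}

// ---- Data layer (outer): depends on the domain interface, never the reverse ----
class InMemoryNewsRepository implements NewsRepository {
  private readonly items: NewsItem[];

  constructor(items: NewsItem[]) {
    this.items = items;
  }

  fetchAll(): NewsItem[] {
    return this.items;
  }
}
```

Swapping `InMemoryNewsRepository` for a network-backed or database-backed implementation requires no change to the use case, and the use case can be tested with nothing but a hand-built list.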
Architecture Pattern Comparison
| Pattern | Files Per Screen | Testability | Learning Curve | Best For | Used By |
|---|---|---|---|---|---|
| MVC | 1-2 | Low | Low | Prototypes, small apps | Early iOS apps |
| MVP | 3-4 | High | Medium | Legacy Android apps | Pre-2018 Android |
| MVVM | 2-3 | High | Medium | Most modern apps | Instagram, Google apps |
| MVI | 3-4 | Very High | High | Complex state, debugging | Twitter/X, Cash App |
| VIPER | 5-6 | Very High | Very High | Large teams, strict boundaries | Uber (early), banking apps |
| Clean Architecture | 4-6 | Very High | High | Long-lived enterprise apps | Google samples, enterprise |
AI-Assisted Engineering Lens: Architecture
Interview: How do you choose an architecture pattern for a new mobile app?
- How many engineers will touch this codebase? Solo or 2-3 engineers: MVVM is the sweet spot — enough structure to be testable, little enough ceremony to stay fast. 5-10 engineers: MVVM with Clean Architecture layers to enforce module boundaries. 10+ engineers: Consider MVI or VIPER for strict isolation, but only if the coordination cost without it is measurable.
- How complex is the state management? A content-consumption app (news reader, social feed) has simple state — MVVM is fine. A financial trading app with real-time data, complex form validation, and undo/redo needs MVI’s single-state-tree and intent-based mutations to stay debuggable.
- What is the team’s experience? Introducing VIPER to a team that has never used anything beyond MVC will slow them down for months. Ship with MVVM, let pain reveal itself, then evolve. Architecture is not a day-one decision that is permanent — it is a living choice that should respond to real problems.
- “I always use MVVM because it is the standard.” — No reasoning behind the choice. Architecture selection without trade-off analysis signals tutorial-driven thinking.
- “We should use Clean Architecture with VIPER for everything to keep it clean.” — Over-engineering without considering team size or app complexity. Ceremony for ceremony’s sake.
- “Architecture does not matter much, you can always refactor later.” — Ignores that mobile refactors require App Store releases and cannot be hot-patched.
- “I would start with MVVM and evolve to MVI only if state debugging becomes painful — I have seen teams adopt MVI prematurely and spend 40% of their time writing boilerplate intents for simple screens.”
- “The architecture I pick is a function of team size, state complexity, and release cadence. For a 3-person team shipping weekly, MVVM with Clean Architecture layers is the sweet spot.”
- “I evaluate architecture patterns using a ceremony-to-value ratio — how much boilerplate does this pattern demand per screen relative to the testability and team-scaling benefits it provides?”
- Failure mode: “What happens when an architecture migration stalls halfway? You end up with two patterns coexisting — new engineers do not know which to use, bugs appear at the seams between old and new screens, and testing coverage fragments. I have seen a 6-month ‘modernization’ create more bugs than it fixed.”
- Rollout: “Strangle pattern behind feature flags. Migrate one screen at a time, starting with the screen that changes most frequently. Ship each migrated screen behind a flag so you can revert to the old implementation without a store release.”
- Rollback: “If the new architecture causes crash rate regression, disable the feature flag and revert to the legacy screen. The old code stays in the binary until the new version is stable across 100% of users for two release cycles.”
- Measurement: “Track crash-free rate, ANR rate, and developer velocity (time from ticket to merged PR) per screen. If the migrated screen has worse reliability or the same velocity, the migration is not paying for itself.”
- Cost: “Architecture migration has a hidden cost: every engineer must learn the new pattern, code reviews take longer during the transition, and onboarding new hires is harder when two patterns coexist. Budget 20-30% velocity loss during migration.”
- Security/Governance: “Module boundaries in Clean Architecture enforce access control at the code level — a feature module cannot directly access another feature’s data layer. This matters for compliance-sensitive apps where data isolation between features is auditable.”
What changes in the real app store world?
- Architecture mistakes are costlier to fix. A poorly chosen pattern that causes state bugs or crashes cannot be hot-patched. Feature flags become the escape hatch: wrap new architectural patterns behind flags so you can revert to the old codepath without a store release.
- Migration must be invisible. When you strangle a Massive View Controller into MVVM, you cannot do a “big bang” rewrite and ship it all at once. If it breaks, you are stuck for days. Ship screen-by-screen, behind flags, and monitor crash-free rates at each stage.
- Your architecture must support staged rollout. If your architecture tightly couples screens (Screen A directly imports and instantiates Screen B), you cannot roll out a rewritten Screen B to 5% of users. Feature modules with navigation abstraction are not just “clean code” — they are operational necessities.
- Process death testing is non-negotiable before release. On the web, a user refreshes the page. On mobile, the OS silently kills your app and restores it — and your architecture must handle that. Every architecture decision should pass the “what happens after process death?” test before it ships through the store.
- Failure mode: “A team ships an architecture migration without feature flags. The new pattern causes a subtle state restoration bug after process death. Crash-free rate drops from 99.5% to 97% — but only on devices with <3GB RAM where the OS kills the app more aggressively. The team cannot roll back without another store submission.”
- Rollout: “Every architecture change ships behind a flag. First 1% rollout for 48 hours, monitoring crash-free rate segmented by memory tier. Only expand after confirming stability on low-end devices.”
- Rollback: “The flag is the rollback. Disable it server-side and the old code path activates within minutes. The old code stays in the binary for two full release cycles.”
- Measurement: “Track crash-free rate per screen, ANR rate per screen, and time-to-interactive per screen. Compare old vs new implementation side-by-side using the feature flag as an A/B split.”
- Cost: “App Store review latency means every bug that escapes to production costs 24-48 hours of user pain minimum. This makes mobile architecture mistakes 10x more expensive than web architecture mistakes.”
- Security/Governance: “Apple and Google review processes flag apps with unusual runtime behavior. Architectures that load code dynamically (beyond React Native’s standard JS execution) risk review rejection. Keep your architecture choices within platform-sanctioned patterns.”
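The flag-gated percentage rollouts described in this section are commonly implemented by hashing a stable user identifier into a bucket. The sketch below assumes an FNV-1a style hash over UTF-16 code units and 10,000 buckets; the function names and the flag key used in the usage note are hypothetical, and a real feature-flag service may bucket users differently.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units (an assumption for this
// sketch; any stable, well-distributed hash would do).
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Maps (flag, user) to a stable bucket in [0, 10000).
// Including the flag key keeps different flags from enabling the same users.
function rolloutBucket(flagKey: string, userId: string): number {
  return fnv1a32(`${flagKey}:${userId}`) % 10000;
}

// A user is in the rollout when their bucket falls below the threshold.
// Raising the percentage only adds users; nobody already enabled is dropped.
function isEnabled(flagKey: string, userId: string, rolloutPercent: number): boolean {
  const clamped = Math.min(100, Math.max(0, rolloutPercent));
  return rolloutBucket(flagKey, userId) < clamped * 100;
}
```

For example, `isEnabled("new_checkout_screen", userId, 1)` selects roughly 1% of users, and setting the percentage to 0 server-side acts as the rollback. Because the bucket is derived from the user ID, a user sees the same code path on every launch for a given percentage.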
2. Native vs Cross-Platform
This is the single most consequential technical decision in mobile engineering. It affects hiring, performance, maintenance cost, time-to-market, and the user experience. And the right answer genuinely depends on your situation.
2.1 Native Development
iOS (Swift/SwiftUI):
- Full access to every Apple API on day zero
- SwiftUI (declarative, 2019+) and UIKit (imperative, mature)
- Xcode is the only IDE option — and its build times are a known pain point
- Swift’s type system, value types, and protocol-oriented programming produce safe, predictable code
- SwiftUI adoption: by 2024, most new feature development at major companies uses SwiftUI, but UIKit remains for complex custom components and backwards compatibility
Android (Kotlin/Jetpack Compose):
- Kotlin has been officially supported on Android since 2017; Google declared Android development “Kotlin-first” in 2019
- Jetpack Compose (declarative UI, stable since 2021) is the future; XML layouts are legacy
- Android Studio (IntelliJ-based) has better tooling than Xcode for refactoring and debugging
- Fragment/Activity system is powerful but has notorious lifecycle complexity
- Device fragmentation: thousands of device models, screen sizes, and OS versions to support
2.2 React Native
React Native lets you write mobile apps in JavaScript/TypeScript using React components that render to native views (not a WebView).
The Old Architecture (pre-2022):
- JavaScript thread runs the app logic
- A Bridge serializes JSON messages between JS and native threads
- The Bridge was a bottleneck — every JS-to-native call required async JSON serialization/deserialization
- Touch events, scroll positions, and animations that crossed the bridge felt laggy
JSI (JavaScript Interface)
Fabric (New Rendering System)
TurboModules
2.3 Flutter
Flutter takes a radically different approach: it does not use platform UI components at all. Instead, it renders every pixel itself using the Skia graphics engine (and increasingly Impeller, a newer engine optimized for mobile GPUs).
How Flutter renders:
- You write Dart code using Flutter’s widget system
- Flutter compiles Dart to native ARM code (AOT compilation)
- At runtime, Flutter uses Skia/Impeller to draw every pixel on a raw canvas surface
- Platform UI components (UIKit views, Android Views) are not used — Flutter draws its own buttons, text fields, scroll views, everything
- Pixel-perfect consistency across platforms — the UI looks identical on iOS and Android because it is drawn by the same engine
- No bridge overhead — there is no JS-to-native communication because there is no JavaScript. Dart compiles to native code
- The downside: platform fidelity is lower. A Flutter app does not automatically get iOS-specific scroll physics, Android-specific ripple effects, or platform-native text selection behavior. Flutter approximates these, but the approximation is noticeable to discerning users
2.4 Kotlin Multiplatform (KMP)
KMP is the newest serious contender, and it takes the most pragmatic approach: share business logic, keep UI native.
How it works:
- Write shared business logic in Kotlin (networking, data models, business rules, local storage)
- Compile that Kotlin code to JVM bytecode (Android), native ARM via LLVM (iOS), or JavaScript (web)
- UI layer is fully native: Jetpack Compose on Android, SwiftUI on iOS
- Use expect/actual declarations for platform-specific implementations (like accessing Keychain on iOS vs Keystore on Android)
- Netflix (networking layer)
- Cash App (shared business logic)
- Philips (healthcare apps)
- JetBrains (all their mobile apps)
- Google (some internal projects)
The Decision Matrix
- Choose Native When
- Choose React Native When
- Choose Flutter When
- Choose KMP When
- User experience is the product differentiator — banking apps, consumer social, photo/video editing
- Deep platform integration required — AR/VR, health sensors, accessibility features, complex animations
- You can afford two dedicated teams — the hiring cost is real but the quality ceiling is highest
- App Store performance is critical — native apps have the lowest cold start times and smoothest scroll performance
- Examples: Instagram, Uber (rider app), Apple’s own apps, most banking apps
The Airbnb vs Shopify Paradox
Both Airbnb and Shopify are large, well-funded companies with world-class engineering teams. Airbnb left React Native. Shopify adopted it. How can the same technology be wrong for one and right for the other?
Airbnb’s context (2016-2018):
- Complex consumer app with heavy animations, maps, date pickers, payment flows
- Existing large native teams that resisted the abstraction
- Deep platform integration needs (ARKit, custom camera, complex gesture handling)
- “Write once, run anywhere” was the stated goal — and 30% of code still needed platform-specific versions
Shopify’s context:
- Merchant-facing apps (point-of-sale, admin) with simpler UI needs
- Smaller mobile team relative to app count — they needed to ship multiple apps
- React Native’s New Architecture solved many of Airbnb’s performance complaints
- “Write once, adapt per platform” was the goal — more realistic than “write once, run anywhere”
AI-Assisted Engineering Lens: Cross-Platform
expect/actual boundary is a frequent source of subtle behavioral differences.
On-device AI as a cross-platform differentiator. Core ML (iOS) and TensorFlow Lite / MediaPipe (Android) have different model format requirements and performance characteristics. Cross-platform frameworks add an abstraction layer that can reduce on-device ML performance. For apps with AI features (real-time translation, image recognition, voice processing), the framework choice now includes “can this framework efficiently run on-device ML models?” as a selection criterion. Flutter and React Native require native module bridges for ML inference; KMP can call platform ML APIs directly from shared code.
Interview: Your company is starting a new mobile app. How do you choose between native and cross-platform?
- Team composition. What does the existing team know? If I have 6 React engineers and 0 mobile engineers, React Native lets us ship in 8 weeks. Hiring two native teams takes 3 months before anyone writes a line of code.
- UI complexity. Is this a content app (lists, forms, text) or an experience app (custom animations, gestures, camera, AR)? Content apps are great cross-platform candidates. Experience apps need native.
- Platform API depth. Do we need Bluetooth, NFC, HealthKit, ARKit, or push notification customization beyond the basics? Each deep platform API is a potential pain point in cross-platform frameworks.
- Update velocity. Can we tolerate 2-3 day App Store review cycles, or do we need OTA updates? React Native with CodePush can push JS updates instantly. Native apps cannot.
- Long-term maintenance cost. Cross-platform saves upfront cost but can increase maintenance cost. Every major OS update risks breaking the framework’s abstraction layer. When iOS 18 ships a new API, native apps get it immediately. Cross-platform apps wait for the framework to support it.
- Choosing based on personal preference (“I like Dart”)
- Not considering the hiring market for each framework
- Assuming “cross-platform” means “half the work” (it is more like 70% of the work)
- Ignoring the long-term upgrade path when major OS versions ship
- “I would use Flutter because it is the fastest framework.” — No consideration of team skills, business context, or platform API needs. Framework loyalty over engineering judgment.
- “Cross-platform is always the right choice because it saves money.” — The 70%-not-50% reality of cross-platform effort is ignored. Total cost of ownership includes maintenance, OS update compatibility, and hiring.
- “We should just go native for everything.” — Ignores budget, team composition, and time-to-market constraints.
- “I would prototype the riskiest screen — the one with the deepest platform integration — in both the cross-platform framework and native. If the prototype takes 3x longer in the cross-platform framework, the time savings on simpler screens will not compensate.”
- “The framework choice is a 3-year decision, not a 3-month decision. I evaluate total cost of ownership: initial build, hiring pipeline for the chosen framework, major OS update compatibility, and the cost of eventually migrating if the framework loses momentum.”
- “I frame this as an amortized team cost problem. Two native teams cost 2x salary but ship platform-optimal experiences. One cross-platform team costs 1x salary but needs 1.3-1.5x time per feature and faces framework-specific friction.”
- Failure mode: “Choosing React Native for an AR-heavy app because the team knew React. Six months later, every AR feature requires custom native modules, the team is debugging JSI bridge issues for gesture-to-native-view synchronization, and the cross-platform advantage has evaporated. The ‘write once’ promise collapsed at the platform API boundary.”
- Rollout: “For a cross-platform migration, start with a single low-risk feature (settings screen, profile page) in the new framework. Ship it to 100% behind a flag. Measure crash-free rate, startup time impact, and developer velocity before migrating critical screens.”
- Rollback: “Keep the native implementation of critical screens in the binary for at least two release cycles after migrating to cross-platform. The feature flag lets you revert per-screen without a full framework rollback.”
- Measurement: “Compare: developer velocity (features shipped per sprint), crash-free rate by framework, cold start time regression, binary size increase, and hiring pipeline health (are candidates available for this framework?).”
- Cost: “Cross-platform saves 30-40% on initial development but can cost 20-30% more on maintenance during major OS updates (waiting for framework compatibility patches). Model the 3-year TCO, not just the MVP cost.”
- Security/Governance: “Cross-platform frameworks add a dependency on the framework’s security update cycle. A vulnerability in React Native’s JSI layer or Flutter’s Dart runtime must be patched by the framework team before you can ship the fix. Native apps depend only on platform SDKs, which Apple and Google patch on their own schedule.”
What changes in the real app store world?
- Review rejection risk differs by framework. Apple occasionally rejects apps that use certain cross-platform patterns. Flutter and React Native are well-established, but custom bridge code or non-standard rendering techniques can trigger scrutiny. Native apps face fewer “how is this built?” rejections.
- OTA updates are a cross-platform superpower — with limits. React Native’s CodePush can bypass App Store review for JS changes, but Apple’s guidelines prohibit OTA updates that “change the app’s primary purpose.” If your OTA update is a bug fix, you are safe. If it adds a major feature, you risk rejection on your next store submission.
- Binary size and store thresholds. By default, iOS asks for confirmation before downloading an app larger than 200MB over cellular (users can change this in Settings on iOS 13 and later). Cross-platform frameworks add framework overhead (Flutter: ~5-10MB, React Native: ~3-7MB, KMP: ~1-2MB). For apps already near the threshold, this overhead matters.
- App Store promotional considerations. Apple features apps that showcase platform technologies (SwiftUI, ARKit, WidgetKit). Cross-platform apps are rarely featured because they do not demonstrate platform-native capabilities. If App Store featuring is part of your growth strategy, native gives you an edge.
- Staged rollout is your safety net regardless of framework. Both Google Play and the App Store support phased rollout. Use it religiously — 1% for 24 hours, check crash-free rate, then expand. This applies equally to native and cross-platform releases.
3. App Lifecycle and Navigation
Understanding the mobile app lifecycle is not optional: it is the difference between an app that works and an app that crashes, leaks memory, or loses user data.
3.1 Android Activity Lifecycle
The lifecycle events that matter most in practice:
| Event | When It Fires | What You Should Do | What Goes Wrong If You Do Not |
|---|---|---|---|
| onCreate | Activity first created | Initialize UI, restore saved state from Bundle, set up ViewModel | N/A (you literally cannot skip this) |
| onResume | Activity becomes interactive | Resume camera/sensors, start location updates, refresh data | Stale data, camera not restarting after phone call |
| onPause | Activity partially obscured | Pause camera/sensors, save draft data | Battery drain from sensors, data loss |
| onStop | Activity no longer visible | Release heavy resources, unregister broadcast receivers | Memory leaks, battery drain |
| onSaveInstanceState | Before potential destruction | Save UI state to Bundle (scroll position, form inputs, selected tab) | User rotates phone and loses all form input |
| onDestroy | Activity being destroyed | Clean up final resources | Memory leaks |
3.2 iOS UIViewController Lifecycle
When the system runs low on memory, iOS sends didReceiveMemoryWarning() to all view controllers. You must release cached data, images, and non-essential resources, or the OS will terminate your app. This is not a suggestion. Apps that ignore memory warnings get killed.
SwiftUI lifecycle (modern iOS):
UIHostingController still manages the UIKit lifecycle. Understanding UIKit lifecycle remains essential for debugging.
3.3 Navigation Patterns
Stack-based navigation: The fundamental pattern. Push a screen onto the stack, pop to go back. UINavigationController (iOS), NavHost with NavController (Android Jetpack Navigation), Stack.Navigator (React Navigation).
Tab-based navigation: Persistent bottom tabs for top-level destinations. Each tab maintains its own navigation stack. UITabBarController (iOS), BottomNavigationView (Android Material), Tab.Navigator (React Navigation).
Deep linking and Universal Links:
Deep linking lets external sources (push notifications, emails, web links, other apps) open specific screens in your app.
URL scheme deep links (legacy)
Custom-scheme URLs such as myapp://product/123. Simple but insecure: any app can register the same scheme. No verification that your app owns that scheme.
Universal Links (iOS) / App Links (Android)
Standard HTTPS URLs such as https://myapp.com/product/123 that open your app instead of the browser. Verified via a JSON file hosted on your domain (apple-app-site-association for iOS, assetlinks.json for Android). Secure because only the domain owner can host the verification file.
Deferred deep links
NSUserActivity or manual state encoding in encodeRestorableState(with:) handles it. The key discipline: never navigate based on in-memory-only state.
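As a rough illustration of the deep-linking discipline above, the sketch below turns an incoming verified HTTPS link into a route that carries everything the destination screen needs. The `Route` type, `TRUSTED_HOST`, and `routeFromLink` are hypothetical names; real apps would normally rely on the platform or navigation library to do the matching.

```typescript
// A route carries everything the destination needs to rebuild itself.
type Route =
  | { screen: "home" }
  | { screen: "product"; productId: string };

// Hypothetical domain that hosts the app's link-verification files.
const TRUSTED_HOST = "myapp.com";

// Returns a Route for links this app owns, or null for anything else
// (custom schemes, other hosts, unknown paths).
function routeFromLink(link: string): Route | null {
  // Capture the host and the path; ignore query string and fragment.
  const match = /^https:\/\/([^\/?#]+)(\/[^?#]*)?/.exec(link);
  if (match === null || match[1].toLowerCase() !== TRUSTED_HOST) {
    return null;
  }
  const path = match[2] ?? "/";
  const segments = path.split("/").filter((segment) => segment.length > 0);

  if (segments.length === 0) {
    return { screen: "home" };
  }
  if (segments.length === 2 && segments[0] === "product" && /^\d+$/.test(segments[1])) {
    return { screen: "product", productId: segments[1] };
  }
  return null;
}
```

With this shape, `routeFromLink("https://myapp.com/product/123")` yields a product route, while `myapp://product/123` and links on other hosts are rejected. Because the product ID travels in the route itself, the product screen can reload its data from that argument alone after a cold start.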
Interview: Your app crashes when users return to it after a few hours in the background. How do you investigate?
- Reproduce it deterministically. On Android: put the app in the background, then run adb shell am kill com.myapp. On iOS: background the app, then stop the process from Xcode. Do not force-quit from the App Switcher, because a user-initiated quit discards the restoration state. Relaunching after either step simulates process death reliably.
- Check the crash stack. Is it a NullPointerException or force unwrap on state that should have been initialized? That is the signature of process death: code assumes state exists because it was set in the original launch, but the recreated Activity does not go through the same flow.
- Audit state storage. Anything in a companion object, singleton, or regular ViewModel property dies with the process. Move critical state to:
  - Android: SavedStateHandle in the ViewModel, onSaveInstanceState Bundle, or Room database
  - iOS: UserDefaults for small values, NSUserActivity for UI state, Core Data for complex state
- Fix the navigation. If the app uses deep links or intent-based navigation, ensure the destination screen can reconstruct itself from the navigation arguments alone, without relying on state set by a previous screen.
- Add process death to CI. Run Espresso/XCUITest with process death simulation as part of the test suite. If it passes without process death and fails with it, you have found a process-death-specific bug.
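The difference between state that dies with the process and state that survives it can be modeled in a few lines. This is a simplified TypeScript sketch, not Android or iOS code: `SavedState` is a hypothetical stand-in for SavedStateHandle or a Bundle, and `CheckoutViewModel` is an invented example.

```typescript
// Stands in for SavedStateHandle / a Bundle: in this model it is the only
// object that outlives the process.
class SavedState {
  private readonly entries = new Map<string, string>();

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: string): void {
    this.entries.set(key, value);
  }
}

class CheckoutViewModel {
  // Plain property: lives only in memory and is lost on process death.
  cachedCartTotal: number | null = null;

  private readonly saved: SavedState;

  constructor(saved: SavedState) {
    this.saved = saved;
  }

  // Backed by saved state: reads and writes go through the persisted store.
  get draftAddress(): string {
    return this.saved.get("draftAddress") ?? "";
  }

  set draftAddress(value: string) {
    this.saved.set("draftAddress", value);
  }
}

// Simulates the OS killing the process and recreating the screen:
// every object is rebuilt from scratch, and only SavedState is handed back.
function recreateAfterProcessDeath(saved: SavedState): CheckoutViewModel {
  return new CheckoutViewModel(saved);
}
```

After `recreateAfterProcessDeath`, `draftAddress` still holds what the user typed, while `cachedCartTotal` is `null` again. Code that reads `cachedCartTotal` without handling `null` is exactly the NullPointerException or force-unwrap crash described above.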
- Blaming the crash on “low memory” without understanding that the OS recreates the app, not just kills it
- Storing critical state in singletons or companion objects
- Never testing process death scenarios (most teams do not)
- “It is probably a memory leak. I would add more RAM.” — Confuses process death with out-of-memory. Does not understand that the OS intentionally kills the app and expects it to restore gracefully.
- “I would add a try-catch around the crash.” — Treats the symptom, not the cause. The null state after process death will just produce wrong behavior instead of a crash.
- “We should just tell users to keep the app in the foreground.” — Blaming the user for OS behavior.
- “This is almost certainly a process death issue. The signature is: works fine during normal use, crashes only when returning after extended background time. The OS killed the process, recreated the Activity stack, and the code assumed in-memory state that no longer exists.”
- “My first step is to reproduce it deterministically with adb shell am kill, then audit every screen for state that lives only in ViewModel properties without SavedStateHandle backing.”
- “I would add process death simulation to our CI pipeline so this class of bug is caught before release. Most teams never test this scenario, which is why it is the most common class of hard-to-reproduce crash.”
- Failure mode: “Process death crashes are insidious because they are non-deterministic in production — they depend on device memory pressure, which varies by device model and user behavior. A crash that affects 0.1% of sessions on a Pixel 7 might affect 5% on a budget device with 2GB RAM.”
- Rollout: “After fixing process death bugs, deploy behind a feature flag that gates the state restoration path specifically. Monitor crash-free rate segmented by memory tier (devices with <3GB vs 3-6GB vs >6GB).”
- Rollback: “If the fix introduces a regression (e.g., state is now persisted but restored incorrectly), disable the flag to revert to the previous behavior while a corrected fix ships.”
- Measurement: “Track ‘background return crash rate’ as a distinct metric from overall crash rate. Segment by time-in-background duration: crashes after 1 hour vs 4 hours vs 24 hours reveal different failure modes.”
- Cost: “Process death crashes disproportionately affect power users who multitask heavily — the same users most likely to leave negative reviews and churn. The business cost of a 0.5% process death crash rate is higher than a 0.5% cold-start crash rate.”
- Security/Governance: “State restoration must not inadvertently expose sensitive data. If a user backgrounds a banking app, the process dies, and restoration shows the previous screen with account details — that is a security violation if another person picks up the unlocked phone. Implement a re-authentication gate on sensitive screens after process death.”
SavedStateHandle, a CI step that runs every instrumented test with a process-death cycle injected, and an architecture guideline document that defines the state persistence contract for every data tier. The staff engineer recognizes that process death bugs are a systemic failure of the team’s development practices, not an individual code bug.
AI-Assisted Engineering Lens: Lifecycle and Navigation
Part II — Mobile Performance and Constraints
4. Mobile-Specific Constraints
Mobile devices are not small laptops. They have fundamentally different constraints, and ignoring those constraints produces apps that drain battery, drop frames, and get killed by the OS.
4.1 Battery Optimization
Battery is the most precious resource on a mobile device. Users notice battery drain before they notice anything else, and “drains my battery” is the #2 reason for uninstalls (after crashes).
Android background execution limits (doze mode, app standby): Since Android 6.0 (Marshmallow), Android aggressively restricts background activity:
- Doze mode: When the screen is off and the device is stationary, the OS batches all network access, alarms, and jobs into infrequent “maintenance windows.” Your background sync that ran every 5 minutes? It now runs once per hour or less.
- App Standby Buckets (Android 9+): Apps are categorized into Active, Working Set, Frequent, Rare, and Restricted buckets based on recency of use. Rare apps get almost no background execution.
- Background execution limits (Android 8+): Apps cannot start background services freely. Must use
WorkManager(for deferrable work) or foreground services with a visible notification (for ongoing work like music playback).
- iOS is even more restrictive. Background execution is limited to specific modes: audio playback, location updates, VoIP, Bluetooth, background fetch (OS-controlled, not app-controlled), and push notification processing.
- Background App Refresh: The OS decides when to grant your app background execution time, based on usage patterns. If the user opens your app every morning at 8 AM, iOS will pre-fetch data around 7:50 AM. You cannot force a specific schedule.
- Background URLSession: For large downloads/uploads that must complete even if the app is backgrounded. The OS manages the transfer and wakes your app when it completes.

Battery optimization strategies:
- Batch network requests. Instead of 10 individual API calls, batch into 1. Every radio wake-up costs significant battery.
- Use the radio wisely. The cellular radio has three states: idle (low power), connected (high power), and a “tail” state (still high power for 15-30 seconds after the last request, waiting for more). Sending a request every 20 seconds keeps the radio perpetually in the high-power state.
- Defer non-urgent work. WorkManager (Android) and BGTaskScheduler (iOS) let the OS batch your work with other apps’ work, minimizing total radio and CPU wake-ups.
- Avoid wake locks. A wake lock prevents the device from sleeping. Forgetting to release a wake lock is one of the fastest ways to drain a battery to zero.
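
The radio "tail" argument is easy to check with arithmetic. The following is a rough sketch, not a power model: it assumes a 20-second tail and a 1-second transfer per request (both illustrative numbers within the range quoted above), and `radio_high_power_seconds` is a hypothetical helper that merges overlapping high-power windows.

```python
def radio_high_power_seconds(request_times, tail_seconds=20.0, transfer_seconds=1.0):
    """Estimate how long the radio stays in a high-power state.

    Each request keeps the radio active for transfer_seconds plus a tail of
    tail_seconds. Windows that overlap are merged into one.
    """
    total = 0.0
    window_start = window_end = None
    for t in sorted(request_times):
        end = t + transfer_seconds + tail_seconds
        if window_end is None or t > window_end:
            # The radio had dropped back to idle: close the old window.
            if window_end is not None:
                total += window_end - window_start
            window_start, window_end = t, end
        else:
            # The radio was still in its tail: the window just gets longer.
            window_end = max(window_end, end)
    if window_end is not None:
        total += window_end - window_start
    return total


# Ten requests sent 20 seconds apart vs. the same ten sent together.
spread = radio_high_power_seconds([i * 20.0 for i in range(10)])
batched = radio_high_power_seconds([0.0] * 10)
```

Under these assumptions the spread-out schedule never lets the radio go idle, while the batched schedule pays for a single window. Real transfer times for a batch will be longer than for one request, so treat the ratio as directional only.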
4.2 Network Constraints
Mobile networks are fundamentally hostile compared to wired connections:

| Constraint | WiFi | 4G LTE | 3G | Subway/Rural |
|---|---|---|---|---|
| Latency (RTT) | 5-30ms | 30-100ms | 100-500ms | 500ms-timeout |
| Bandwidth | 50-500 Mbps | 5-50 Mbps | 0.5-5 Mbps | 0-0.5 Mbps |
| Reliability | High | Medium | Low | Very low |
| Packet loss | <1% | 1-5% | 5-15% | 15-50% |
- Assume the network is unreliable. Every API call should have a timeout, retry logic, and a fallback behavior when offline.
- Design for high latency. A request that takes 5ms on WiFi might take 500ms on cellular. UI that blocks on network calls feels broken on cellular.
- Minimize request count. Each TCP connection establishment is expensive on cellular. HTTP/2 multiplexing and request batching are not optimizations — they are necessities.
- Handle transitions. Users walk from WiFi to cellular and back. Ongoing requests will fail. Your networking layer must detect connectivity changes and retry transparently.
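
The "timeout, retry logic, and a fallback" rule can be sketched as a small wrapper. This is a platform-neutral illustration: `TransientNetworkError`, `call_with_retry`, and the default delays are assumptions made for the example, and the per-request timeout is assumed to be enforced inside the `request` callable.

```python
import random
import time


class TransientNetworkError(Exception):
    """Hypothetical error type standing in for timeouts and dropped connections."""


def backoff_delays(max_attempts, base=0.5, cap=8.0):
    """Exponential backoff ceilings: base, 2*base, 4*base, ... capped at cap."""
    return [min(cap, base * (2 ** i)) for i in range(max_attempts - 1)]


def call_with_retry(request, max_attempts=4, base=0.5, cap=8.0,
                    fallback=None, sleep=time.sleep, rng=random.random):
    """Run request(); retry transient failures with full-jitter backoff.

    Returns fallback (for example, cached data) if every attempt fails.
    """
    delays = backoff_delays(max_attempts, base, cap)
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientNetworkError:
            if attempt == max_attempts - 1:
                return fallback
            # Full jitter: sleep a random fraction of the ceiling so that many
            # clients recovering at once do not retry in lockstep.
            sleep(delays[attempt] * rng())
```

Injecting `sleep` and `rng` keeps the wrapper testable without real waiting. Only retry operations that are safe to repeat; non-idempotent writes need a deduplication key on the server side.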
4.3 Memory Pressure
Mobile devices have 3-8GB of RAM shared across all running apps. The OS will terminate background apps to reclaim memory, and it will terminate your foreground app if you exceed a memory threshold (typically 1-2GB on modern devices, less on older ones).

Android memory management:
- onTrimMemory() callback with escalating severity levels (TRIM_MEMORY_RUNNING_MODERATE, TRIM_MEMORY_RUNNING_LOW, TRIM_MEMORY_RUNNING_CRITICAL)
- At TRIM_MEMORY_RUNNING_LOW, release all caches, large bitmaps, and non-essential allocations
- ActivityManager.getMemoryInfo() gives you current available memory

iOS memory management:
- didReceiveMemoryWarning(): release everything non-essential
- Jetsam (iOS’s memory killer) terminates apps that exceed their memory budget with no warning and no callback
- Use Instruments’ Allocations tool to track high water mark memory usage
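
One way to make "release caches under pressure" concrete is a cache with a byte budget that shrinks when the OS signals trouble. The sketch below is illustrative only: `TrimmableCache` and the `"moderate"` / `"low"` / `"critical"` level names are hypothetical stand-ins for whatever your platform callback reports.

```python
from collections import OrderedDict


class TrimmableCache:
    """LRU cache with a byte budget that can shrink under memory pressure."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self._items = OrderedDict()  # key -> (value, size_bytes), oldest first
        self.used_bytes = 0

    def put(self, key, value, size_bytes):
        if key in self._items:
            self.used_bytes -= self._items.pop(key)[1]
        self._items[key] = (value, size_bytes)
        self.used_bytes += size_bytes
        self._evict_to(self.max_bytes)

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]

    def _evict_to(self, target_bytes):
        while self.used_bytes > target_bytes and self._items:
            _, (_, size) = self._items.popitem(last=False)  # drop the LRU entry
            self.used_bytes -= size

    def on_memory_pressure(self, level):
        """level is a hypothetical 'moderate' | 'low' | 'critical' signal."""
        if level == "moderate":
            self._evict_to(self.max_bytes // 2)
        else:
            self._evict_to(0)  # low or critical: release everything
```

The design choice worth copying is that eviction is one code path (`_evict_to`) used both for normal inserts and for pressure callbacks, so the emergency path is exercised constantly rather than only in production.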
4.4 Thermal Throttling
Heavy CPU/GPU usage causes the device to heat up. When it reaches a thermal threshold, the OS throttles CPU frequency, sometimes by 50% or more. An app doing complex image processing might start at 60fps, heat the device over 2-3 minutes, and drop to 20fps as thermal throttling kicks in.

Mitigation strategies:
- Profile sustained workloads, not peak bursts. A 2-second benchmark tells you nothing about sustained performance.
- Move heavy computation off the main thread and, ideally, to a background processing queue that can be paused.
- On iOS, use ProcessInfo.ThermalState to detect throttling and reduce workload.
- On Android, use PowerManager.getCurrentThermalStatus() and the PowerManager.THERMAL_STATUS_* constants (Android 10+).
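
"Detect throttling and reduce workload" usually ends up as a small policy function. A minimal sketch, assuming a four-level scale that mirrors the iOS thermal states (Android statuses would need mapping onto it); `ThermalLevel`, `next_batch_size`, and the batch fractions are hypothetical choices, not platform APIs.

```python
from enum import IntEnum


class ThermalLevel(IntEnum):
    """Four levels named after the iOS thermal states."""
    NOMINAL = 0
    FAIR = 1
    SERIOUS = 2
    CRITICAL = 3


def next_batch_size(level, full_batch=32):
    """Hypothetical policy: shrink each work batch as the device heats up."""
    if level >= ThermalLevel.CRITICAL:
        return 0  # pause the background queue entirely
    if level == ThermalLevel.SERIOUS:
        return max(1, full_batch // 4)
    if level == ThermalLevel.FAIR:
        return max(1, full_batch // 2)
    return full_batch
```

A processing queue would call `next_batch_size` before each batch, which is what makes the "queue that can be paused" recommendation above practical.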
5. Mobile Performance Optimization
5.1 Startup Time Optimization
App startup time is the single most impactful performance metric. Google’s research shows that 53% of mobile site visits are abandoned if a page takes longer than 3 seconds to load. App expectations are even higher: users expect interactive content within 1-2 seconds.

Three types of startup:

| Type | Definition | Typical Target | What Happens |
|---|---|---|---|
| Cold start | App process does not exist. OS loads it from scratch. | < 1 second | Process creation, Application.onCreate(), first Activity/ViewController rendering |
| Warm start | Process exists but Activity was destroyed. | < 500ms | Activity.onCreate() re-runs but Application.onCreate() is skipped |
| Hot start | App was in background, brought to foreground. | < 200ms | Activity.onResume() runs, minimal work |
Measure first
Android: adb shell am start -S -W com.myapp/.MainActivity gives you TotalTime. iOS: Instruments’ App Launch template. You cannot optimize what you have not measured.
Minimize Application/AppDelegate initialization
Lazy-load dependencies
Optimize the first frame
Reduce binary size for faster loading
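
The "lazy-load dependencies" step comes down to deferring construction until first use. A minimal, thread-safe sketch of that idea follows; `Lazy` is a hypothetical helper written for this example (Kotlin's `by lazy` and Swift's `lazy var` provide the same behavior natively).

```python
import threading


class Lazy:
    """Defers an expensive initializer until the value is first requested."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._ready = False
        self._value = None

    def get(self):
        if not self._ready:
            with self._lock:
                # Re-check inside the lock so only one thread runs the factory.
                if not self._ready:
                    self._value = self._factory()
                    self._ready = True
        return self._value
```

Wrapping an SDK client in `Lazy` moves its setup cost from app launch to the first screen that actually needs it. The trade-off is that the first caller pays the latency, so anything needed for the first frame should still be initialized eagerly.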
AI-Assisted Engineering Lens: Performance Optimization
Application.onCreate() by synchronous calls to AnalyticsSDK.init() and FeatureFlagService.fetch(). Moving these to a background coroutine would save ~380ms.” This turns trace analysis from a specialist skill into an accessible workflow.

On-device ML inference and startup trade-offs. Apps integrating on-device AI models (Core ML, TensorFlow Lite) face a new startup cost: model loading. A 50MB model loaded synchronously at startup adds 200-500ms. The pattern: lazy-load models on first use, warm them during splash screen display, and use quantized models (INT8 instead of FP32) to reduce both load time and memory footprint. Baseline Profiles on Android can pre-compile the ML inference hot paths.

5.2 Rendering Performance (60fps and Beyond)
Most phone displays refresh at 60Hz, so each frame must complete in about 16.7ms (1000ms / 60) for animation to look smooth. On 120Hz devices (iPhone Pro, Samsung Galaxy S-series), the budget drops to 8.3ms per frame. Miss that budget, and the user perceives “jank”: visible stuttering.

What happens in a frame:

| Cause | How to Detect | Fix |
|---|---|---|
| Main thread blocking | Systrace/Instruments shows long task on main thread | Move work to background thread/coroutine |
| Overdraw | Android: Developer Options > Show GPU overdraw | Reduce overlapping backgrounds, flatten view hierarchy |
| Complex view hierarchy | Layout Inspector shows deep nesting | Use ConstraintLayout (Android), avoid nested ScrollViews |
| Large image decoding | Memory profiler shows spike during scroll | Decode at display size, use Glide/Coil/Kingfisher |
| RecyclerView/UICollectionView misconfiguration | Dropped frames during fast scroll | Use stable IDs, implement DiffUtil/NSDiffableDataSourceSnapshot |
| Jetpack Compose recomposition | Layout Inspector shows unnecessary recompositions | Use remember, derivedStateOf, stable keys, avoid creating objects in composition |
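
The frame budget is just the refresh interval, and jank is any frame that overruns it. A small sketch of that arithmetic, with `frame_budget_ms` and `count_janky_frames` as hypothetical helper names and made-up frame timings:

```python
def frame_budget_ms(refresh_hz):
    """Time available to produce one frame at the given refresh rate."""
    return 1000.0 / refresh_hz


def count_janky_frames(frame_times_ms, refresh_hz=60):
    """Count frames whose render time exceeded the per-frame budget."""
    budget = frame_budget_ms(refresh_hz)
    return sum(1 for t in frame_times_ms if t > budget)


# Illustrative render times in milliseconds for four consecutive frames.
sample = [10.0, 16.0, 17.0, 33.0]
janky_at_60 = count_janky_frames(sample, refresh_hz=60)
janky_at_120 = count_janky_frames(sample, refresh_hz=120)
```

The same four frames look very different on a 120Hz panel, which is why a screen that feels fine on a 60Hz test device can stutter on a flagship.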
5.3 Image Loading and Caching
Images dominate mobile app memory usage and network bandwidth. A social media feed with 20 visible images, each at 1080x1080, would consume about 93MB of memory at full RGBA resolution (20 × 1080 × 1080 × 4 bytes). Image libraries solve this with multi-level caching and efficient decoding.

The image loading pipeline:

| Library | Platform | Language | Key Differentiator |
|---|---|---|---|
| Glide | Android | Java/Kotlin | Lifecycle-aware, efficient memory management |
| Coil | Android | Kotlin-first | Coroutine-based, lighter than Glide, Compose-native |
| SDWebImage | iOS | Obj-C/Swift | Mature, feature-rich, WebP support |
| Kingfisher | iOS | Swift | Modern Swift API, SwiftUI support, processor pipeline |
| Nuke | iOS | Swift | Performance-focused, prefetch support, pipeline architecture |
- Downsample at decode time. Never decode a 4000x3000 image to display in a 200x200 thumbnail. All major libraries do this if you provide the target size.
- Pre-fetch. When the user is viewing item 10 in a list, start loading images for items 15-20. Glide’s preload() and Nuke’s ImagePrefetcher support this.
- Progressive JPEG. Display a blurry version immediately, sharpen as more data arrives. Instagram uses this: you see the image “loading in” from blurry to sharp.
- WebP/AVIF. 25-35% smaller than JPEG at equivalent quality. WebP is supported on Android 4.0+ and iOS 14+; AVIF requires Android 12+ and iOS 16+. Serve from CDN with format negotiation.
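
The payoff from downsampling at decode time is easy to quantify. The sketch below assumes 4 bytes per pixel and a power-of-two subsampling factor, which is how many decoders expose downsampling; `bitmap_bytes` and `sample_size` are hypothetical helpers, not library calls.

```python
def bitmap_bytes(width, height, bytes_per_pixel=4):
    """Memory needed to hold a fully decoded bitmap."""
    return width * height * bytes_per_pixel


def sample_size(src_w, src_h, target_w, target_h):
    """Largest power-of-two factor that keeps both dimensions >= the target."""
    size = 1
    while (src_w // (size * 2)) >= target_w and (src_h // (size * 2)) >= target_h:
        size *= 2
    return size


# The 4000x3000 photo shown in a 200x200 thumbnail from the list above.
factor = sample_size(4000, 3000, 200, 200)
full = bitmap_bytes(4000, 3000)
reduced = bitmap_bytes(4000 // factor, 3000 // factor)
```

Here the decoded thumbnail source is 500x375, a 64x reduction in memory, and the view scales it the rest of the way. The factor stops before either dimension would fall below the target so the thumbnail is never upscaled.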
5.4 Memory Leak Detection
Memory leaks on mobile are insidious. Unlike a server that can be restarted, a leaked Activity or ViewController accumulates over time as the user navigates, eventually causing the app to be killed by the OS.

Common leak patterns:
- Activity/Fragment held by a long-lived reference. A singleton retains a reference to an Activity context. The Activity cannot be garbage collected even after the user navigates away. Fix: use Application context for singletons, or use WeakReference.
- Callback/listener not unregistered. Register a listener in onResume, forget to unregister in onPause. The listener retains the Activity.
- Inner class retaining outer class. A non-static inner class (Java) or a closure/block (Swift) implicitly captures this/self. If that inner class/closure is passed to a long-lived object (like a network callback), it retains the enclosing Activity/ViewController.
- RxJava/Combine subscription not disposed. Observable subscriptions keep the subscriber alive. Dispose in onDestroy/deinit, or use viewModelScope/lifecycleScope on Android and .store(in: &cancellables) on iOS.

- LeakCanary (Android): Automatically detects Activity/Fragment leaks in debug builds. Zero configuration. It watches destroyed Activities and alerts if they are not garbage collected within 5 seconds.
- Instruments > Leaks (iOS): Xcode’s profiling tool. Run the app, exercise navigation, check for leaked objects.
- Android Studio Memory Profiler: Visualize memory allocation in real-time, force GC, capture heap dumps.
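
The "long-lived object holds a listener" leak, and the weak-reference fix, can be shown in a few lines. This is an analogy in Python rather than Android or iOS code: `Screen` stands in for an Activity or ViewController and `EventBus` for a singleton; both are hypothetical.

```python
import gc
import weakref


class Screen:
    """Stand-in for an Activity or ViewController."""

    def __init__(self, name):
        self.name = name

    def on_event(self, event):
        return f"{self.name} got {event}"


class EventBus:
    """Long-lived, singleton-style object that outlives individual screens."""

    def __init__(self):
        self._listeners = []  # weak references to bound methods

    def register(self, bound_method):
        # WeakMethod does not keep the screen alive, unlike a plain reference.
        self._listeners.append(weakref.WeakMethod(bound_method))

    def publish(self, event):
        results, alive = [], []
        for ref in self._listeners:
            listener = ref()  # None once the screen has been collected
            if listener is not None:
                results.append(listener(event))
                alive.append(ref)
        self._listeners = alive  # prune entries for dead screens
        return results
```

If `register` stored `bound_method` directly, every screen that ever registered would stay in memory for the lifetime of the bus. Weak references are a safety net; explicit unregistration in the matching lifecycle callback is still the clearer contract.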
5.5 Binary Size Optimization
App size directly affects install conversion rate. Google’s data shows that for every 6MB increase in APK size, install conversion drops by 1%. For users on limited data plans or low-storage devices (still common in markets like India, Southeast Asia, and Africa), a 100MB app is a non-starter.

Android strategies:
- Android App Bundles (AAB): Upload a bundle; Google Play generates optimized APKs per device (correct density, ABI, language). Saves 20-40% vs universal APK.
- R8/ProGuard: Minifies code, removes unused classes and methods. Can reduce DEX size by 30-50%.
- resConfigs: Only include the languages your app actually supports. Libraries often bundle translations for dozens of languages, and without this setting all of them ship in your app.
- Vector drawables over PNGs: A vector icon is 1-2KB. The same icon as PNG at all densities is 20-40KB.

iOS strategies:
- App Thinning: App Store delivers device-specific binaries. @1x assets go to old devices, @3x to iPhones with Retina HD.
- On-Demand Resources: Assets downloaded after install when needed, automatically purged by the OS when storage is low.
- Bitcode (deprecated in Xcode 14): Allowed Apple to re-optimize binary for new CPU architectures. No longer relevant but may come up in interviews about historical context.
6. Offline-First Architecture
Offline-first is not just “caching.” It is an architecture where the app works fully without a network connection, and sync happens when connectivity is available. It is dramatically harder than server-first architecture, but for certain categories of apps, it is the difference between usable and unusable.

6.1 The Core Pattern: Local-First Data
6.2 Conflict Resolution Strategies
When two devices edit the same data offline and then sync, conflicts are inevitable. There is no magic solution, only trade-offs:
- Last Write Wins (LWW)
- Merge
- CRDTs (Conflict-Free Replicated Data Types)
- Operational Transform (OT)
6.3 Local Databases for Mobile
| Database | Platform | Type | Strengths | Best For |
|---|---|---|---|---|
| Room | Android | SQLite wrapper | Type-safe queries, compile-time verification, LiveData/Flow integration | Structured relational data on Android |
| Core Data | iOS | Object graph | Deep Apple integration, iCloud sync, NSFetchedResultsController for UI binding | iOS apps in the Apple ecosystem |
| Realm | Cross-platform | Object database | Live objects (auto-updating), easy schema, cross-platform | Cross-platform apps needing real-time sync |
| SQLite (direct) | Both | Relational | Maximum control, no wrapper overhead | Performance-critical or custom query patterns |
| MMKV | Both | Key-value | Extremely fast (mmap-based), 100x faster than SharedPreferences | Preferences, small config values, caching tokens |
6.4 Sync Protocols and Patterns
Queue-based offline operations:
Interview: Design an offline-capable note-taking app
- Local storage. Room (Android) or Core Data (iOS) as the primary database. Every note has: localId (UUID, generated on device), serverId (nullable, assigned after first sync), content, lastModified (local timestamp), syncStatus (enum: synced, pending, conflicted).
- Write path. All writes go to the local database first. UI updates immediately from the local database. A SyncWorker (using WorkManager on Android, BGTaskScheduler on iOS) processes pending writes when connectivity is available.
- Sync protocol. Client sends changed notes since last sync (tracked by a lastSyncTimestamp). Server responds with changes from other devices since the same timestamp. This is a delta sync: only changed notes transfer, not the full dataset.
- Conflict resolution. For a note-taking app, I would use field-level merge. If Device A changed the title and Device B changed the body, merge both changes. If both changed the same field, present a conflict UI showing both versions and let the user choose, or default to last-write-wins with the option to view history.
- Edge cases I would address:
  - Note created on two devices with the same local UUID. Extremely unlikely with UUID v4 but handle it: the server assigns unique server IDs regardless.
  - Note deleted on Device A while edited on Device B. The delete wins, but the edited version is preserved in a ‘recently deleted’ view for recovery.
  - Large attachments (images). Sync metadata immediately, download attachments lazily. Do not block text sync on image upload.
- Scaling the sync. For users with thousands of notes, a full delta sync becomes expensive. Introduce pagination: sync the 50 most recently modified notes first, then backfill older notes in the background.
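
The field-level merge described in the answer above can be sketched as a three-way merge. One assumption is not in the note schema listed earlier: the client keeps the last synced version of each note (`base`) so it can tell which side changed a field. `merge_note` and the title/body fields are illustrative.

```python
def merge_note(base, local, remote):
    """Three-way, field-level merge.

    base is the last version both sides agreed on. Returns (merged, conflicts),
    where conflicts lists fields changed differently on both sides.
    """
    merged, conflicts = {}, []
    for field in base:
        local_changed = local[field] != base[field]
        remote_changed = remote[field] != base[field]
        if local_changed and remote_changed and local[field] != remote[field]:
            conflicts.append(field)
            merged[field] = local[field]  # keep local until the user decides
        elif remote_changed:
            merged[field] = remote[field]
        else:
            merged[field] = local[field]
    return merged, conflicts
```

A non-empty `conflicts` list is what would flip the note's syncStatus to conflicted and trigger the conflict UI. Keeping the local value in the meantime is a choice made for this sketch; the important part is that nothing is discarded silently.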
- Treating the server as the source of truth instead of the local database
- Ignoring conflict resolution (“we will just use timestamps”)
- Not handling the local-to-server ID mapping
- Forgetting that the sync can fail partway through
- “I would just cache the API responses locally.” — Caching is not offline-first. Caching gives you read-only offline access. Offline-first means the app is fully functional (read and write) without a network connection.
- “We can use last-write-wins for everything.” — Silently discarding user edits is unacceptable in a note-taking app. Users will lose work and lose trust.
- “Sync is easy, just POST the changes when the network comes back.” — Ignores conflict resolution, partial sync failure, ID mapping, and the fact that the server may have received changes from other devices in the meantime.
- “The local database is the source of truth, not the server. The UI always reads from local. The server is a sync target, and the sync engine is a background process that runs independently of user interaction.”
- “For a note-taking app, I would use field-level merge for conflict resolution. Title and body are separate fields — if Device A edits the title and Device B edits the body, both changes are preserved without conflict. Same-field conflicts get surfaced to the user.”
- “The hardest part of offline-first is not the sync — it is the ID mapping. Entities created offline reference each other by local IDs. After sync, every reference must be updated to server IDs without breaking the relational integrity of the local database.”
- Failure mode: “A sync fails midway through — 3 of 5 operations succeed, 2 fail. Without idempotent sync operations, retrying creates duplicates for the 3 that already succeeded. Every sync operation must be idempotent, identified by a client-generated UUID that the server uses for deduplication.”
- Rollout: “Ship offline support incrementally: Phase 1 is offline read (cached data). Phase 2 is offline write for new entities. Phase 3 is offline edit for existing entities. Phase 4 is conflict resolution UI. Each phase ships behind a flag.”
- Rollback: “If the sync engine has a bug that corrupts data, the feature flag disables offline writes and reverts to server-first mode. Local unsynced changes are preserved in the queue but not processed until the fix ships.”
- Measurement: “Track sync success rate, average sync latency, conflict rate (what percentage of syncs produce conflicts), and data loss incidents (user reports of missing edits). Conflict rate above 5% suggests the resolution strategy needs refinement.”
- Cost: “Offline-first adds 40-60% to the initial data layer development cost compared to server-first. But it reduces ongoing support costs because the app is resilient to backend outages and network issues. Model the 2-year TCO, not just the build cost.”
- Security/Governance: “Offline data persists on device in an unencrypted local database by default. For sensitive data (medical notes, legal documents), encrypt the local database with SQLCipher or encrypted Core Data. The encryption key should be stored in Keychain/Keystore, not derived from user input.”
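
The partial-failure scenario above (3 of 5 operations succeed, then the sync dies) is worth simulating. In this sketch `FakeSyncServer` is an in-memory stand-in for a real endpoint, and `drop_after` is a hypothetical knob that simulates the connection dropping; the only technique being illustrated is deduplication by a client-generated operation ID.

```python
import uuid


def make_operation(payload):
    """Tag each queued write with a client-generated ID used for deduplication."""
    return {"op_id": str(uuid.uuid4()), "payload": payload}


class FakeSyncServer:
    """In-memory stand-in for a sync endpoint that deduplicates by op_id."""

    def __init__(self):
        self.results = {}  # op_id -> result of the first application
        self.notes = []

    def apply(self, op):
        if op["op_id"] in self.results:
            return self.results[op["op_id"]]  # replay: no second write
        self.notes.append(op["payload"])
        result = {"server_id": len(self.notes)}
        self.results[op["op_id"]] = result
        return result


def sync(queue, server, drop_after=None):
    """Send queued operations in order.

    drop_after simulates the connection dying after that many operations.
    On failure the queue is left untouched, so the whole batch is retried.
    """
    for index, op in enumerate(queue):
        if drop_after is not None and index >= drop_after:
            raise ConnectionError("network lost mid-sync")
        server.apply(op)
    queue.clear()
```

Because the client cannot know which operations the server applied before the failure, it resends all five. The server recognizes the first three by ID and returns their stored results instead of creating duplicates.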
AI-Assisted Engineering Lens: Offline-First Architecture
Part III — Mobile Infrastructure
7. Push Notifications
Push notifications are the most abused and least understood feature in mobile engineering. They seem simple: send a message, the phone shows it. In reality, the delivery pipeline is complex, guarantees are weak, and misuse destroys user engagement.

7.1 Architecture: APNs and FCM
APNs (Apple Push Notification service):
- Your server authenticates to APNs using either a certificate or a signed JWT
- You send a JSON payload (max 4KB) to a device token
- APNs delivers the notification to the device — eventually, with no delivery guarantee
- If the device is offline, APNs stores the most recent notification per topic (not all of them) and delivers it when the device reconnects
- Notification coalescing: APNs may combine multiple notifications into one if the device is offline for a long time

FCM (Firebase Cloud Messaging):
- Your server sends a message to FCM’s HTTP v1 API, authenticating with a service account (legacy server keys are deprecated)
- FCM delivers to the device via Google Play Services
- Two message types: notification messages (FCM handles display) and data messages (your app handles everything)
- On Android, data messages are delivered even when the app is killed (within doze mode constraints). On iOS, data messages require the app to have background modes enabled.
- Topic messaging: Subscribe devices to topics (e.g., “breaking-news”). Send once to the topic, FCM delivers to all subscribers.
7.2 Silent Push for Background Sync
Silent push notifications wake your app in the background without showing anything to the user. This is how chat apps fetch new messages, how email apps sync mailboxes, and how news apps pre-fetch content.

iOS: Set content-available: 1 in the push payload. The system wakes your app and gives it ~30 seconds of background execution time. But iOS throttles silent push: if you send too many, the OS will stop waking your app.

Android: Send a data-only FCM message (no notification field). Your FirebaseMessagingService.onMessageReceived() runs, even in the background. More reliable than iOS for background processing, but still subject to doze mode delays.
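
On the server side, both silent-push variants are just differently shaped JSON. The sketch below only builds the payloads; sending them, authentication, and the custom keys (`sync`, `count`) are outside its scope and purely illustrative. The FCM body follows the HTTP v1 message shape, where data values must be strings.

```python
import json

APNS_MAX_PAYLOAD_BYTES = 4096  # the 4KB payload limit mentioned in 7.1


def apns_silent_payload(custom):
    """Background-only APNs payload: content-available and no alert, sound, or badge."""
    # App-specific keys live next to "aps"; "aps" is written last so it wins.
    payload = {**custom, "aps": {"content-available": 1}}
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(encoded) > APNS_MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {len(encoded)} bytes, limit is {APNS_MAX_PAYLOAD_BYTES}")
    return payload


def fcm_data_message(device_token, data):
    """Data-only FCM message body: a "data" map and no "notification" field."""
    return {
        "message": {
            "token": device_token,
            "data": {key: str(value) for key, value in data.items()},
        }
    }
```

For APNs, the request headers matter as much as the body: background pushes are expected to carry `apns-push-type: background` and a priority of 5. Keep the payload to a hint ("something changed in the inbox") and let the app fetch the actual data, since delivery is neither guaranteed nor ordered.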
7.3 Notification Permission Strategy
On iOS, you get one chance to ask for notification permission. If the user declines, the only way to re-ask is to direct them to Settings, which almost no one does. Permission request timing is critical.

Best practices:
- Do not ask on first launch. The user has no relationship with your app yet. Permission rates for first-launch requests are 40-50%. Pre-primed requests (after the user has experienced value) get 60-80%.
- Use a pre-permission screen. Show a custom UI explaining the value of notifications (“Get notified when your order ships”) before triggering the system dialog. If the user declines your custom UI, you have not burned the system prompt.
- Respect the decline. If a user declines, do not ask again for at least 30 days. And when you do, provide a new, compelling reason.
- Notification channels (Android 8+). Group notifications into channels (Messages, Promotions, Order Updates). Users can disable specific channels without disabling all notifications.
8. Mobile Networking
8.1 API Design for Mobile
Mobile-optimized APIs differ from web APIs in several ways:

Pagination: Infinite scroll feeds need cursor-based pagination, not offset-based. Offset pagination breaks when items are added or removed between pages (the user sees duplicates or misses items). Cursor-based pagination uses a stable pointer (usually the ID of the last item) to fetch the next page.

8.2 Certificate Pinning
Certificate pinning ensures your app only communicates with your server, not an impersonator. Without pinning, any Certificate Authority (CA) trusted by the device can issue a certificate for your domain, and a man-in-the-middle attacker with a rogue CA certificate can intercept all traffic.

How it works: Your app embeds the expected server certificate (or its public key hash). During the TLS handshake, the app compares the server’s certificate against the pinned value. If they do not match, the connection is rejected.

The operational trap: If your pinned certificate expires and you have not shipped an app update with the new certificate, your app stops working. Completely. Users cannot even reach your server to get the update. This has caused real outages.

Mitigation:
- Pin the public key, not the certificate. Public keys survive certificate rotation as long as the renewed certificate reuses the same key pair.
- Pin at least two keys (primary and backup).
- Include a long-lived backup pin for a certificate you have not deployed yet.
- Implement a kill switch: a feature flag that disables pinning if you make a mistake.
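
The pin check itself is small: hash each public key in the presented chain and look for a match among the pinned values. A minimal sketch, assuming pins are base64-encoded SHA-256 hashes of the DER-encoded SubjectPublicKeyInfo; the byte strings in the usage lines are placeholders rather than real keys, and `connection_allowed` is a hypothetical helper that runs in addition to normal TLS validation, not instead of it.

```python
import base64
import hashlib


def spki_pin(spki_der: bytes) -> str:
    """Base64-encoded SHA-256 of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def connection_allowed(chain_spki_ders, pinned, pinning_enabled=True):
    """Accept if any public key in the presented chain matches a pin.

    pinning_enabled models the remote kill switch. When it is False the
    caller relies on standard TLS validation only.
    """
    if not pinning_enabled:
        return True
    return any(spki_pin(der) in pinned for der in chain_spki_ders)


# Placeholder key material for illustration only.
primary_key = b"placeholder-primary-public-key"
backup_key = b"placeholder-backup-public-key"
pins = {spki_pin(primary_key), spki_pin(backup_key)}
```

Shipping the backup pin before its certificate is deployed is what lets you rotate keys without a coordinated app release. In production, prefer the platform mechanisms (for example OkHttp's CertificatePinner or Android's network security config) over hand-rolled checks.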
8.3 GraphQL on Mobile
GraphQL is particularly well-suited to mobile because it solves the over-fetching and under-fetching problems that plague REST on bandwidth-constrained connections.

Apollo Client (iOS/Android): The dominant GraphQL client for mobile. Features include normalized caching (two queries that return the same user get deduplicated in cache), optimistic mutations (UI updates before server confirms), and code generation from GraphQL schemas.

The tradeoff on mobile: GraphQL queries are larger than REST URLs (you send the query string with every request). This matters on very slow connections. The mitigation is persisted queries: the client sends a hash of the query, and the server looks up the full query. This shrinks the query portion of each request to a fixed-size hash (64 hex characters for SHA-256).

8.4 gRPC on Mobile
gRPC uses Protocol Buffers (binary serialization) and HTTP/2. On mobile, this means smaller payloads and faster parsing than JSON REST.

When gRPC makes sense on mobile:
- High-frequency real-time data (streaming stock prices, location updates)
- Large payloads where Protobuf’s binary encoding saves significant bandwidth
- When the backend team already uses gRPC and you want type-safe contracts

When it does not:
- Simple CRUD apps where the Protobuf/gRPC setup overhead is not justified
- When you need to debug network traffic easily (binary Protobuf is not human-readable)
- When CDN caching is important (gRPC over HTTP/2 is harder to cache at CDN edge nodes)
9. Mobile CI/CD and Release
9.1 The App Store Bottleneck
Unlike web deployment where you push and it is live in seconds, mobile releases go through a gatekeeper:

| Aspect | Apple App Store | Google Play Store |
|---|---|---|
| Review time | 24-48 hours (sometimes longer) | 2-7 days (increased in recent years) |
| Rejection rate | ~30% of first submissions (Apple’s 2023 data) | Lower, but increasing |
| Common rejections | Crashes, broken links, guideline violations (IAP rules, privacy), metadata issues | Policy violations, targeting API level, privacy declarations |
| Expedited review | Available (request via App Store Connect) | Not officially available |
| Phased rollout | Yes (1%, 2%, 5%, 10%, 20%, 50%, 100% over 7 days) | Yes (custom percentages) |
9.2 Over-the-Air (OTA) Updates
OTA updates let you push JavaScript/asset updates to mobile apps without going through App Store review. This is specific to apps with a JavaScript runtime (React Native) or asset-based content.

CodePush (React Native):
- Push JS bundle updates directly to devices
- Users get the update on next app launch (or even immediately with a mandatory update)
- Does not work for native code changes (adding a new native module requires a store release)
- Microsoft-owned (part of App Center); its future is uncertain because App Center was retired in March 2025

Expo Updates:
- Similar to CodePush but integrated with the Expo ecosystem
- EAS Update provides hosted update infrastructure
- Supports update channels (production, staging, preview)
9.3 Feature Flags for Mobile
Feature flags on mobile are more critical than on web because you cannot roll back a release. Once a user has version 3.2.0, you cannot force them to downgrade.

Mobile-specific feature flag considerations:
- Version targeting. Flag should be evaluable by app version. “Enable new checkout for version >= 3.5.0” is essential for gradual migrations.
- Offline evaluation. The app must be able to evaluate flags without a network connection. Cache flag values locally and refresh periodically.
- Kill switches. Every risky feature should have a flag that can disable it remotely. Ship the feature behind a flag, enable it for 1% of users, monitor crash rates, and roll out gradually.
- Stale flags. Mobile apps in the wild may have flag values cached for weeks (if the user does not open the app). Set a maximum cache TTL and force a refresh on app foreground.
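
Version targeting, offline evaluation, kill switches, and a cache TTL fit together in a small client-side store. The following is a sketch under assumptions: `FlagStore`, the rule shape (`enabled`, `min_version`), and the flag names are invented for illustration and do not correspond to any particular flag service.

```python
def parse_version(text):
    """Turn "3.10.0" into (3, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


class FlagStore:
    """Locally cached flags that can be evaluated without a network call."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self._rules = {}  # flag name -> {"enabled": bool, "min_version": str}
        self._fetched_at = None

    def update(self, rules, now):
        """Store rules fetched from the server, with the fetch time."""
        self._rules = dict(rules)
        self._fetched_at = now

    def needs_refresh(self, now):
        """True if flags were never fetched or are older than the TTL."""
        return self._fetched_at is None or now - self._fetched_at > self.max_age_seconds

    def is_enabled(self, name, app_version, default=False):
        rule = self._rules.get(name)
        if rule is None:
            return default  # unknown flag: fall back to the safe default
        if not rule["enabled"]:
            return False  # remote kill switch
        minimum = rule.get("min_version", "0")
        return parse_version(app_version) >= parse_version(minimum)
```

Comparing parsed tuples rather than strings matters: as text, "3.10.0" sorts before "3.5.0". Calling `needs_refresh` on app foreground is one way to implement the maximum cache TTL described above.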
9.4 Crash Reporting
Crashlytics (Firebase): The industry standard for mobile crash reporting. Automatic crash grouping, affected user count, version breakdown, and breadcrumbs (logs leading up to the crash). Free. Integrates with Android and iOS natively.

Sentry: More powerful than Crashlytics for detailed error context, performance monitoring, and release health tracking. Supports React Native, Flutter, and native. Paid (with a generous free tier).

Key metrics to monitor:
- Crash-free rate: Target > 99.5% for a healthy app. > 99.9% for a well-maintained app. Below 99% is a serious quality problem.
- Crash-free users: Track alongside crash-free sessions, because the two diverge. One user crashing 50 times and 50 users crashing once each produce the same crashed-session count but very different user impact.
- ANR rate (Android): Application Not Responding — the main thread is blocked for > 5 seconds. Google Play Console shows this. Target < 0.5%.
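
The difference between crash-free sessions and crash-free users shows up as soon as you compute both from the same data. A minimal sketch with made-up session records; `crash_free_rates` is a hypothetical helper, not how any crash reporter computes its dashboards.

```python
def crash_free_rates(sessions):
    """sessions: iterable of (user_id, crashed) pairs, one per app session."""
    sessions = list(sessions)
    crashed_sessions = sum(1 for _, crashed in sessions if crashed)
    users = {user for user, _ in sessions}
    crashed_users = {user for user, crashed in sessions if crashed}
    return {
        "crash_free_sessions": 1 - crashed_sessions / len(sessions),
        "crash_free_users": 1 - len(crashed_users) / len(users),
    }


# 100 users with 10 sessions each, and 5 crashed sessions in both scenarios.
one_user = [(u, u == 0 and s < 5) for u in range(100) for s in range(10)]
five_users = [(u, u < 5 and s == 0) for u in range(100) for s in range(10)]
concentrated = crash_free_rates(one_user)
spread_out = crash_free_rates(five_users)
```

Both scenarios report 99.5% crash-free sessions, but crash-free users is 99% in the first and 95% in the second. Watching only one of the two metrics hides either the breadth or the depth of a crash.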
10. Mobile Security
10.1 Secure Storage
- iOS Keychain
- Android Keystore
- kSecAttrAccessibleWhenUnlocked: Available only when device is unlocked. Default, use for most tokens.
- kSecAttrAccessibleAfterFirstUnlock: Available after first unlock until reboot. Use for background sync tokens.
- kSecAttrAccessibleWhenUnlockedThisDeviceOnly: Same as kSecAttrAccessibleWhenUnlocked, but not backed up to iCloud or migrated to another device.
10.2 Root/Jailbreak Detection
Rooted (Android) or jailbroken (iOS) devices bypass the OS’s security model. On a rooted device, a process with root access can read other apps’ private data and may reach your Keychain/Keystore entries.

Detection techniques (Android):
- Check for su binary in common paths (/system/bin/su, /system/xbin/su)
- Check for root management apps (SuperSU, Magisk Manager)
- Use the Play Integrity API (Google’s attestation service and the successor to the deprecated SafetyNet Attestation API)
- Check Build.TAGS for “test-keys” (indicates a non-official build)

Detection techniques (iOS):
- Check for Cydia app (jailbreak app store)
- Attempt to write to a restricted path (/private/jailbreaktest)
- Check if fork() succeeds (sandboxed apps cannot fork)
- Use DeviceCheck API for Apple’s attestation
10.3 App Attestation
Apple DeviceCheck / App Attest:
- DeviceCheck: Set and query per-device bits on Apple’s servers. Use for fraud prevention (mark a device as having already redeemed a free trial).
- App Attest: Cryptographic proof that the request comes from a legitimate, unmodified version of your app running on a real Apple device. Uses the Secure Enclave to generate assertions.

Google Play Integrity API:
- Returns a verdict: is this a genuine device, running a genuine copy of your app, with a genuine Google Play account?
- Device integrity labels: MEETS_DEVICE_INTEGRITY (genuine, certified device), MEETS_BASIC_INTEGRITY (passes basic checks but may be rooted or uncertified), and MEETS_STRONG_INTEGRITY (hardware-backed guarantee). An empty label set means the device failed the checks (emulator or tampered).
- Use the verdict server-side to gate sensitive operations (payments, account creation)
10.4 Biometric Authentication
Implementation pattern:
- User registers biometric during initial setup (fingerprint or Face ID/face unlock)
- App generates a key pair in the hardware security module (Keystore/Keychain)
- The private key requires biometric authentication to use
- On subsequent auth, the app prompts biometric, gets access to the private key, signs a challenge from the server
- The server verifies the signature with the stored public key
Interview: How do you secure sensitive data in a mobile application?
- Data at rest. Use the platform’s secure storage: Keychain on iOS, Keystore + EncryptedSharedPreferences on Android. Never store tokens in plain UserDefaults/SharedPreferences. For structured data, encrypt the local database (SQLCipher for SQLite, encrypted Core Data).
- Data in transit. Enforce TLS 1.2+ for all network communication. Implement certificate pinning for the most sensitive endpoints (authentication, payments). Pin the public key, not the certificate, and include backup pins.
- App binary. Enable code obfuscation (R8/ProGuard on Android; Swift’s lack of a mature obfuscation tool is a known gap — use third-party tools like SwiftShield for high-security apps). Strip debug symbols from release builds. Never embed secrets in the binary.
- Additional layers for high-security apps: Root/jailbreak detection (block or warn). App attestation (Play Integrity, App Attest) to verify the app is genuine. Biometric-gated key access for sensitive operations. Runtime tampering detection (detect debuggers, hooking frameworks like Frida).
- The meta-principle: Assume the device is compromised. The client is untrusted. Every security-critical decision must be validated server-side. Client-side checks are speed bumps that raise the bar for attackers, but the server is the actual enforcement point.”
- Storing tokens in plain SharedPreferences/UserDefaults
- Embedding API keys in the binary
- Relying solely on client-side root detection without server-side validation
- Not implementing certificate pinning for authentication endpoints
- “We encrypt everything with AES-256.” — Encryption without proper key management is security theater. Where is the key stored? How is it protected? If the key is in the app binary, the encryption is worthless.
- “We use HTTPS so the data is secure.” — HTTPS protects data in transit but says nothing about data at rest or binary security. A decompiled app can reveal hardcoded tokens regardless of transport security.
- “Root/jailbreak detection will prevent attacks.” — Detection is a speed bump, not a wall. Magisk hides root from most detection methods. The real defense is server-side validation.
- “I think about mobile security in three threat surfaces: data at rest (Keychain/Keystore, encrypted databases), data in transit (TLS 1.2+ with certificate pinning on sensitive endpoints), and the binary itself (obfuscation, no embedded secrets, attestation). Each layer assumes the other two might be compromised.”
- “The most important principle is: the client is untrusted. Every security-critical decision is validated server-side. Client-side checks (root detection, biometric gates, certificate pinning) raise the attacker’s cost but are not the actual security boundary.”
- “For a fintech app, I would implement defense in depth: hardware-backed key storage for tokens, certificate pinning with backup pins and a kill switch, Play Integrity / App Attest for binary attestation, and biometric-gated key access for transactions over $100.”
- Failure mode: “A team pins the leaf certificate instead of the public key. The certificate rotates on schedule, and the app cannot connect to the server. 100% of users on the current version are locked out. The fix requires a new app version, but users cannot download it because the app cannot reach the update-check endpoint.”
- Rollout: “Ship certificate pinning behind a feature flag. Enable for 1% of users. Monitor connection success rate (not just crash-free rate — pinning failures are connection failures, not crashes). Expand only after 7 days of 100% connection success.”
- Rollback: “The feature flag disables pinning and falls back to standard TLS validation. The kill-switch endpoint must itself be unpinned — hosted on a separate domain or using a separate URL path excluded from the pinning configuration.”
- Measurement: “Track: connection success rate per endpoint, certificate pinning failure rate, Play Integrity / App Attest pass rate, and ‘time to detect compromised token’ (how quickly your server-side monitoring catches a stolen token being used from a different device).”
- Cost: “Certificate pinning has a non-trivial operational overhead: pin rotation must be coordinated with app releases, backup pins must be managed, and the kill switch must be tested regularly. For a team without a dedicated security engineer, the operational risk may outweigh the security benefit for non-financial apps.”
- Security/Governance: “For apps handling health data (HIPAA), financial data (PCI DSS), or European user data (GDPR), security architecture decisions must be documented in a threat model and reviewed by compliance. Auditors will ask: where are keys stored, how is data encrypted at rest, is certificate pinning implemented, and is the app attested.”
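The pinning rules discussed above (pin the public key, ship backup pins, keep a kill switch) can be sketched as a small decision function. This is a minimal illustration, not a production implementation: `PinConfig`, `evaluatePins`, and the hash strings are hypothetical names, and real apps would normally delegate the check to platform mechanisms rather than hand-roll it. The function assumes the chain has already passed standard TLS validation and that computing the public-key hashes happens elsewhere.

```typescript
// Hypothetical pin configuration; field names are illustrative, not from any SDK.
interface PinConfig {
  pinningEnabled: boolean;  // remote kill switch, fetched from an unpinned endpoint
  spkiSha256Pins: string[]; // primary pin plus backup pins (public-key hashes)
}

type PinDecision = "ALLOW_PINNED" | "ALLOW_UNPINNED" | "REJECT";

// chainSpkiHashes: public-key hashes of every certificate in a chain that has
// already passed normal TLS validation (hash computation is platform-specific).
function evaluatePins(chainSpkiHashes: string[], config: PinConfig): PinDecision {
  if (!config.pinningEnabled) {
    // Kill switch engaged: fall back to standard TLS validation only.
    return "ALLOW_UNPINNED";
  }
  for (const hash of chainSpkiHashes) {
    // A match on ANY configured pin (primary or backup) is sufficient.
    if (config.spkiSha256Pins.indexOf(hash) !== -1) {
      return "ALLOW_PINNED";
    }
  }
  return "REJECT";
}
```

Because the pins are public-key hashes, a certificate that is reissued with the same key pair still matches, and a planned key rotation keeps working as long as the new key's hash was shipped as a backup pin in advance.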
core-security) that provides Keychain/Keystore wrappers, certificate pinning configuration, root detection, and attestation as a reusable library. They write the threat model document, conduct security architecture reviews for other teams, coordinate with the security team on penetration testing, and define the security quality bar (e.g., “no app ships without certificate pinning on auth endpoints and encrypted local storage for tokens”). The staff engineer also manages the operational lifecycle of security infrastructure: certificate rotation schedules, pin update timelines, and incident response runbooks for security failures.
isBiometricVerified = true in SharedPreferences after successful biometric check. Identify all the security issues with this approach, propose a corrected implementation, and explain the threat model each fix addresses. You have 10 minutes.”
AI-Assisted Engineering Lens: Mobile Security
Part IV — Mobile System Design and Career
11. Mobile System Design Interview Patterns
Mobile system design interviews differ from backend system design in critical ways:
- The interviewer expects you to discuss client-side architecture, not just server APIs
- Offline behavior is always relevant — even if the interviewer does not mention it
- Battery and data usage are constraints you should raise proactively
- Platform-specific decisions (which lifecycle to hook into, which storage to use) show depth
11.1 Design a Chat Application (Mobile Client)
Full System Design: Mobile Chat Client
- 1:1 only, or group chat? What is the maximum group size?
- Do we need end-to-end encryption?
- Offline messaging support?
- Media sharing (images, video)?
- Read receipts, typing indicators?
- Real-time delivery: WebSocket. Maintain a persistent WebSocket connection for incoming messages. The connection reconnects automatically on network change. Messages arrive as events and are immediately written to the local database.
- Message sending flow:
  - User taps send → message written to local DB with status SENDING → UI updates immediately (optimistic)
  - Message queued for network delivery
  - WebSocket (if connected) or HTTP POST (if WebSocket is down) sends to server
  - Server acknowledges → status updated to SENT
  - Recipient reads → server notifies → status updated to READ
  - If send fails → status updated to FAILED, retry button shown
- Offline support:
  - All messages stored in Room/Core Data. Chat is readable offline.
  - Outgoing messages queued in a persistent send queue.
  - On reconnect, the queue drains in order. The server handles deduplication via client-generated message IDs.
- Message ordering:
  - Each conversation has a server-assigned sequence number.
  - Client sorts by sequence number, not by local timestamp (which can be wrong if clocks are skewed).
  - Gap detection: if the client receives sequence 42 and then 44, it requests 43 from the server.
- Image/media sharing:
  - Upload image to CDN/S3, get URL.
  - Send message with media URL, not the binary data.
  - Recipient’s client downloads and caches the image.
  - Show thumbnail placeholder during download.
- Battery efficiency:
  - Use silent push notifications to wake the app for new messages when the WebSocket is disconnected (app backgrounded).
  - Batch presence updates (typing indicators): do not send on every keystroke; debounce to 2-3 second intervals.
  - Close the WebSocket after extended background time; rely on push for delivery.
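- For a conversation with 500 members, every message triggers 500 push notifications. Use FCM topic messaging (FCM can also deliver to iOS devices through APNs) so your server does not send 500 individual pushes; APNs on its own expects one request per device token.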
- Message pagination: load the 50 most recent messages on screen open, load older messages on scroll-up.
- Unread counts: maintain a per-conversation unread count in the local database, updated by WebSocket events. Do not query the server for unread counts on every screen load.
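The ordering and deduplication rules above (sort by server sequence number, detect gaps, drop messages whose client-generated ID was already seen) can be sketched as a pure function over the local conversation state. The type and function names here (`ChatMessage`, `ConversationState`, `applyIncoming`) are illustrative assumptions, and a real client would persist this state in the local database rather than in memory.

```typescript
interface ChatMessage {
  clientId: string; // client-generated ID, used for deduplication
  seq: number;      // server-assigned, per-conversation sequence number
  text: string;
}

interface ConversationState {
  messages: ChatMessage[]; // kept sorted by seq
  missingSeqs: number[];   // gaps the client should request from the server
}

function applyIncoming(state: ConversationState, incoming: ChatMessage): ConversationState {
  // Drop duplicates, e.g. a retried send that the server already delivered.
  for (const m of state.messages) {
    if (m.clientId === incoming.clientId) {
      return state;
    }
  }
  // Order by server sequence number, never by local timestamp.
  const messages = state.messages.concat([incoming]).sort((a, b) => a.seq - b.seq);
  // Recompute gaps between consecutive sequence numbers held locally.
  const missingSeqs: number[] = [];
  for (let i = 1; i < messages.length; i++) {
    for (let s = messages[i - 1].seq + 1; s < messages[i].seq; s++) {
      missingSeqs.push(s);
    }
  }
  return { messages, missingSeqs };
}
```

Receiving sequence 42 and then 44 leaves `missingSeqs` equal to `[43]`; once 43 arrives, the list is empty again and the messages are in order.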
- Failure mode: “WebSocket disconnects silently in a tunnel. The user sends a message, it enters the local queue, but the queue processor assumes the WebSocket is connected and drops the message. Fix: the send path must check actual connection state, not just WebSocket object existence. Fall back to HTTP POST for queued messages.”
- Rollout: “Ship E2EE behind a flag for opt-in beta users first. Encryption changes are irreversible — once messages are encrypted, rolling back means users lose access to encrypted message history. Phase 1: encrypt new messages only, display old messages as plaintext. Phase 2: migrate history.”
- Rollback: “For non-E2EE features (typing indicators, read receipts), standard feature flag rollback. For E2EE: you cannot ‘un-encrypt’ — instead, disable encryption for new messages and provide a fallback decryption path for already-encrypted messages using cached keys.”
- Measurement: “Message delivery latency (p50, p95, p99), message delivery success rate (should be >99.5%), offline queue drain time, WebSocket reconnection time, and unread count accuracy (compare server-side and client-side counts).”
- Cost: “E2EE adds 30-50% to the messaging feature’s development cost. Key management (multi-device, key rotation, backup) is a permanent operational burden. Battery cost: encryption/decryption of every message adds CPU usage, though hardware-accelerated AES on modern chips makes this negligible.”
- Security/Governance: “Chat apps handling user PII must comply with data retention regulations. E2EE complicates compliance: if the server cannot read messages, it cannot apply content moderation, legal hold, or data export requests. Some jurisdictions require lawful intercept capability, which conflicts with E2EE. Design the key architecture to support ‘compliance mode’ for enterprise customers.”
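The send flow and the silent-disconnect failure mode described in this design can be sketched together: one function picks the transport from the actual connection health, and another advances the message status. Everything named here (`ConnectionSnapshot`, `chooseTransport`, `nextStatus`, the 30-second heartbeat timeout) is an assumption for illustration; only the four statuses come from the text.

```typescript
type MessageStatus = "SENDING" | "SENT" | "READ" | "FAILED";
type SendTransport = "WEBSOCKET" | "HTTP_POST";
type DeliveryEvent = "SERVER_ACK" | "READ_RECEIPT" | "SEND_ERROR" | "RETRY";

// Hypothetical connection snapshot. A real client would derive `isOpen` from
// the socket's ready state and `lastPongAgeMs` from a heartbeat.
interface ConnectionSnapshot {
  socketExists: boolean;
  isOpen: boolean;
  lastPongAgeMs: number;
}

const HEARTBEAT_TIMEOUT_MS = 30000; // assumed value

function chooseTransport(conn: ConnectionSnapshot): SendTransport {
  // The existence of a socket object is not enough: a connection that died
  // silently in a tunnel still "exists" but has stopped answering heartbeats.
  const healthy = conn.socketExists && conn.isOpen && conn.lastPongAgeMs < HEARTBEAT_TIMEOUT_MS;
  return healthy ? "WEBSOCKET" : "HTTP_POST";
}

function nextStatus(current: MessageStatus, event: DeliveryEvent): MessageStatus {
  if (current === "SENDING" && event === "SERVER_ACK") return "SENT";
  if (current === "SENDING" && event === "SEND_ERROR") return "FAILED";
  if (current === "SENT" && event === "READ_RECEIPT") return "READ";
  if (current === "FAILED" && event === "RETRY") return "SENDING";
  return current; // ignore events that do not apply, e.g. a late ACK after READ
}
```

Keeping both functions pure makes them easy to unit test without a network, which is where this class of bug usually hides.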
11.2 Design a Social Media Feed with Infinite Scroll
Full System Design: Social Feed
11.3 Design a Mobile Payment Flow
Full System Design: Payment Flow
- In-app purchase or processing external payments?
- Stored payment methods or one-time entry?
- What regulatory requirements (PCI DSS, PSD2, local regulations)?
- Never handle raw card numbers. Use a tokenization provider (Stripe, Braintree, Adyen). The user enters card details in the provider’s SDK-rendered UI. The SDK sends card data directly to the provider’s servers, never touching your server. You receive a one-time token that represents the card.
- Idempotency is non-negotiable. Mobile networks drop connections. The user taps “Pay” and the request times out. Did the payment go through? Without idempotency, retrying might charge them twice. Generate a client-side idempotency key (UUID) before the first attempt. Send it with every retry. The server uses it to deduplicate.
- Optimistic UI is dangerous here. Unlike a chat message where showing it immediately is fine, showing “Payment successful” before server confirmation is risky. Use a processing state: “Processing your payment…” → poll or listen for server confirmation → “Payment successful” or “Payment failed.”
- The payment state machine:
- 3D Secure / Strong Customer Authentication (SCA):
  - European PSD2 regulation requires Strong Customer Authentication for many payments; for card payments this is typically done through 3D Secure.
  - The payment flow must handle a redirect: the payment SDK opens a WebView or in-app browser for the bank’s authentication page, then returns to the app with the result.
  - This redirect-and-return is the most fragile part of mobile payments. Test with slow networks, process death during the redirect, and the user killing the WebView.
- Security:
  - Biometric confirmation before payment (Face ID / fingerprint).
  - Certificate pinning on payment endpoints.
  - No payment data in local logs or crash reports.
  - Rate limiting on the client: disable the pay button after tap to prevent double submission.
- Receipts and confirmation:
  - Store transaction records locally for offline access.
  - Send push notification on payment success.
  - Send email receipt as a backup.
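One way to model the payment flow described in this design is an explicit state machine with a processing state and a 3D Secure detour. The states and events below are an illustrative assumption, not the original diagram; the key property is that returning from the bank's page goes back to a processing state instead of jumping to success, so the result always comes from server confirmation.

```typescript
type CheckoutState = "IDLE" | "PROCESSING" | "REQUIRES_3DS" | "SUCCEEDED" | "FAILED";
type CheckoutEvent =
  | "SUBMIT"
  | "CHALLENGE_REQUIRED"
  | "CHALLENGE_RETURNED"
  | "SERVER_CONFIRMED"
  | "SERVER_DECLINED"
  | "RETRY";

// Allowed transitions; any event not listed for a state is ignored.
const checkoutTransitions: Record<CheckoutState, Partial<Record<CheckoutEvent, CheckoutState>>> = {
  IDLE: { SUBMIT: "PROCESSING" },
  PROCESSING: {
    CHALLENGE_REQUIRED: "REQUIRES_3DS",
    SERVER_CONFIRMED: "SUCCEEDED",
    SERVER_DECLINED: "FAILED",
  },
  // Coming back from the bank page proves nothing by itself: return to
  // PROCESSING and wait for the server to confirm or decline.
  REQUIRES_3DS: { CHALLENGE_RETURNED: "PROCESSING" },
  SUCCEEDED: {},
  FAILED: { RETRY: "PROCESSING" },
};

function advanceCheckout(state: CheckoutState, event: CheckoutEvent): CheckoutState {
  const next = checkoutTransitions[state][event];
  return next === undefined ? state : next;
}

// The pay button is only tappable when no attempt is in flight.
function isPayButtonEnabled(state: CheckoutState): boolean {
  return state === "IDLE" || state === "FAILED";
}
```

A second tap on "Pay" while in `PROCESSING` is ignored by the table, which complements disabling the button in the UI.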
- Failure mode: “The network drops during 3D Secure redirect. The user completed bank authentication, but the app never received the callback. The payment is charged but the app shows ‘Payment failed.’ Fix: on app resume, check the payment status server-side using the idempotency key before showing a result. Never trust the client-side callback alone.”
- Rollout: “Payment flow changes are the highest-risk mobile changes. Rollout: 0.1% for 48 hours (monitor successful transaction rate, not just crash-free rate), then 1%, 5%, 20%, 100%. Any drop in transaction success rate triggers immediate halt.”
- Rollback: “Feature flag reverts to the previous payment flow. Both payment UIs coexist in the binary. The server-side payment processing is version-agnostic — it accepts requests from both old and new client versions.”
- Measurement: “Transaction success rate (target >98%), payment latency (time from tap to confirmation), double-charge rate (should be 0% with idempotency keys), 3D Secure completion rate, and abandoned cart rate at the payment step.”
- Cost: “Card processing fees are typically around 2.9% + $0.30 per transaction (Stripe's standard US card pricing; other providers and regions differ). The engineering cost of switching providers is high due to SDK integration, certification, and testing. Evaluate providers on fraud detection quality, 3D Secure support, and mobile SDK quality, not just transaction fees.”
- Security/Governance: “PCI DSS scope shrinks dramatically when raw card numbers never touch your servers or your app’s code. Using the payment provider’s SDK-rendered UI (Stripe Elements, Braintree Drop-in) is the simplest path to PCI compliance. If you build a custom card entry UI, you take on PCI SAQ D compliance, a dramatically more expensive audit burden. Do not do this unless you have a dedicated security team.”
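The idempotency rule from this design can be illustrated with a toy server that remembers the outcome for each key. This is a sketch under stated assumptions: `PaymentServer`, `ChargeResult`, and the in-memory store are hypothetical, the client is assumed to generate one UUID per payment attempt and reuse it on every retry, and a real backend would persist keys durably and expire them.

```typescript
interface ChargeResult {
  chargeId: string;
  amountCents: number;
}

// Toy in-memory server; stands in for a real backend with durable storage.
class PaymentServer {
  private results: Record<string, ChargeResult> = {};
  private chargeCount = 0;

  charge(idempotencyKey: string, amountCents: number): ChargeResult {
    const existing = this.results[idempotencyKey];
    if (existing !== undefined) {
      return existing; // retry of a known attempt: replay the stored result
    }
    this.chargeCount += 1;
    const result: ChargeResult = { chargeId: "ch_" + this.chargeCount, amountCents };
    this.results[idempotencyKey] = result;
    return result;
  }

  // Lets the app ask "did this attempt go through?" after a dropped
  // connection or a lost 3D Secure callback.
  statusFor(idempotencyKey: string): ChargeResult | undefined {
    return this.results[idempotencyKey];
  }

  totalCharges(): number {
    return this.chargeCount;
  }
}
```

On app resume after a timeout, the client would call something like `statusFor` with the same key before deciding whether to show success, failure, or a retry option.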
11.4 Design a Ride-Sharing Rider Experience
Full System Design: Ride-Sharing Rider App
- Map with location: MapKit (iOS) or Google Maps SDK (Android). Show the user’s location, nearby drivers (updated every 3-5 seconds via WebSocket), and route preview.
- Location tracking:
  - GPS updates at 1-second intervals during active ride.
  - Background location updates (iOS background location mode, Android foreground service with notification).
  - Battery optimization: reduce GPS frequency to 30 seconds when the app is in the background and the user is not in an active ride.
- Real-time driver tracking:
  - WebSocket connection receives driver location updates.
  - Smooth animation: interpolate between GPS points to avoid a “jumping” icon. Use a Hermite spline or linear interpolation over 1-second intervals.
  - When WebSocket disconnects (tunnel, elevator), fall back to polling every 5 seconds.
- ETA updates:
  - Recalculate ETA on every driver location update.
  - Use server-side routing (Google Directions API or Mapbox) for accurate ETA based on current traffic.
  - Show uncertainty: “Arriving in 4-6 minutes” rather than “Arriving in 5 minutes.”
- Offline resilience:
  - Cache the last-known driver location and ETA.
  - If the connection drops mid-ride, show “Reconnecting…” and continue displaying the last-known position.
  - The ride continues on the server side regardless of the client’s connectivity.
- Push notifications:
  - “Your driver is arriving” (silent push triggers local notification).
  - “Your ride has ended” with fare summary.
  - “Rate your driver” (30 minutes after ride ends).
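The smooth-animation idea above can be sketched with plain linear interpolation between the two most recent driver fixes. The names `LatLng`, `interpolate`, and `framePosition` are illustrative, the 1-second default interval follows the text, and interpolating raw latitude/longitude is a simplification that is reasonable over the short distances between consecutive fixes.

```typescript
interface LatLng {
  lat: number;
  lng: number;
}

// t is the animation progress between two consecutive fixes, clamped to [0, 1].
function interpolate(from: LatLng, to: LatLng, t: number): LatLng {
  const clamped = Math.max(0, Math.min(1, t));
  return {
    lat: from.lat + (to.lat - from.lat) * clamped,
    lng: from.lng + (to.lng - from.lng) * clamped,
  };
}

// Position to draw on a given frame, assuming the newest fix arrived at
// fixReceivedAtMs and the marker should reach it after intervalMs.
function framePosition(
  from: LatLng,
  to: LatLng,
  fixReceivedAtMs: number,
  nowMs: number,
  intervalMs: number = 1000
): LatLng {
  return interpolate(from, to, (nowMs - fixReceivedAtMs) / intervalMs);
}
```

Clamping means a late frame simply rests on the newest fix instead of overshooting it; a Hermite spline would replace `interpolate` while keeping the same call shape.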
- MVVM with a RideRepository that abstracts the WebSocket + REST API.
- A LocationManager wrapper that handles permission requests, accuracy levels, and battery optimization.
- A MapViewModel that fuses driver location, user location, and route data into a single UI state.
- Failure mode: “GPS location jumps erratically in urban canyons (tall buildings). The driver icon teleports across the map, making the ETA meaningless. Fix: apply a Kalman filter to smooth GPS readings, discard outliers (speed >200km/h between consecutive readings), and interpolate between valid points.”
- Rollout: “Ship map and location changes behind a flag. A/B test the new interpolation algorithm against the old one, measuring user-reported ‘driver location accuracy’ complaints and ETA accuracy (predicted vs actual arrival time).”
- Rollback: “Feature flag reverts to the previous location rendering logic. The location data pipeline (GPS -> WebSocket -> server) is independent of the rendering approach.”
- Measurement: “ETA accuracy (mean absolute error between predicted and actual arrival), location update latency (time from driver GPS reading to rider screen update), battery drain during active ride, and user satisfaction (star ratings correlated with ETA accuracy).”
- Cost: “Continuous GPS at 1-second intervals during an active ride drains 10-15% battery per hour. For a 30-minute ride, that is 5-7.5% battery — acceptable for the rider but a significant cost for drivers who are in rides all day. Drivers need a lower-power location mode between rides.”
- Security/Governance: “Real-time location sharing raises privacy concerns. Location data must be encrypted in transit, retained only for the duration of the ride (plus a safety buffer), and accessible only to the matched rider/driver pair. GDPR requires that users can request deletion of their ride history, including all location data.”
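The outlier rule from the failure mode above (discard readings that imply more than 200 km/h between consecutive fixes) can be sketched with a haversine distance check. This covers only the outlier-discard step, not the Kalman filter; `GpsFix`, `haversineMeters`, and `filterOutliers` are illustrative names.

```typescript
interface GpsFix {
  lat: number;
  lng: number;
  timestampMs: number;
}

const EARTH_RADIUS_M = 6371000;
const MAX_SPEED_KMH = 200; // threshold taken from the text

// Great-circle distance between two fixes, in meters.
function haversineMeters(a: GpsFix, b: GpsFix): number {
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const sinLat = Math.sin(toRad(b.lat - a.lat) / 2);
  const sinLng = Math.sin(toRad(b.lng - a.lng) / 2);
  const h = sinLat * sinLat + Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * sinLng * sinLng;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Keeps a fix only if the speed implied since the last ACCEPTED fix is plausible.
function filterOutliers(fixes: GpsFix[]): GpsFix[] {
  const accepted: GpsFix[] = [];
  for (const fix of fixes) {
    const last = accepted[accepted.length - 1];
    if (last === undefined) {
      accepted.push(fix);
      continue;
    }
    const dtHours = (fix.timestampMs - last.timestampMs) / 3600000;
    if (dtHours <= 0) {
      continue; // out-of-order or duplicate timestamp
    }
    const speedKmh = haversineMeters(last, fix) / 1000 / dtHours;
    if (speedKmh <= MAX_SPEED_KMH) {
      accepted.push(fix);
    }
  }
  return accepted;
}
```

Comparing against the last accepted fix, rather than the previous raw fix, prevents a single bad reading from causing the next good one to be rejected as well.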
11.5 Design an Offline-Capable Note-Taking App
Full System Design: Offline Notes App
- Rich text editing. CRDT-based text state (e.g., Yjs or Automerge) for conflict-free merging across devices.
- Attachments. Images and files stored locally, uploaded asynchronously when online. Reference by local URI initially, replace with CDN URL after upload.
- Search. Full-text search index on the local database (FTS5 in SQLite/Room, NSPersistentContainer with derived attributes in Core Data). Search works offline.
- Sharing and collaboration. Real-time collaboration when online (WebSocket-based, OT or CRDT). Offline edits by collaborators merge when both come online.
- Sync efficiency. Delta sync with vector clocks: each device tracks its own version vector. On sync, devices exchange only operations newer than the other’s last-known vector.
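The delta-sync idea in the last bullet can be sketched with a version vector that maps each device ID to the highest operation counter seen from that device. The names `VersionVector`, `SyncOperation`, `opsNewerThan`, and `mergeVectors` are assumptions for illustration; the sketch only shows which operations to exchange, not how a CRDT applies them.

```typescript
// deviceId -> highest operation counter seen from that device
type VersionVector = Record<string, number>;

interface SyncOperation {
  deviceId: string; // device that produced the operation
  counter: number;  // per-device, monotonically increasing
  payload: string;
}

// Operations in the local log that the remote side has not seen yet.
function opsNewerThan(log: SyncOperation[], remote: VersionVector): SyncOperation[] {
  return log.filter((op) => op.counter > (remote[op.deviceId] || 0));
}

// After a sync, each side knows the element-wise maximum of both vectors.
function mergeVectors(a: VersionVector, b: VersionVector): VersionVector {
  const merged: VersionVector = {};
  for (const id of Object.keys(a).concat(Object.keys(b))) {
    merged[id] = Math.max(a[id] || 0, b[id] || 0);
  }
  return merged;
}
```

If the remote vector is `{ phone: 1 }`, only phone operations with a counter above 1 and every operation from devices the remote has never heard of are sent, instead of the whole log.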
Interview: Walk me through how you would design a mobile system for [X]
- Clarify requirements. Platforms (iOS, Android, both?), offline needs, performance targets, scale (users, data volume).
- Client architecture. Choose an architecture pattern (usually MVVM for most apps). Define the data flow: API → Repository → ViewModel → UI. Show where caching happens (repository level).
- Networking. How does data flow to/from the server? REST or GraphQL for CRUD, WebSocket for real-time. How do we handle offline? Queue-based sync, optimistic UI, conflict resolution.
- Platform-specific decisions. Storage (Room/Core Data), background work (WorkManager/BGTaskScheduler), notifications (FCM/APNs), navigation (Jetpack Navigation/UINavigationController).
- Performance and constraints. Startup time strategy, image loading, memory management, battery optimization. These are the details that signal mobile expertise vs ‘I watched a system design video.’
- Failure mode: “A candidate designs a beautiful client architecture but forgets that the app will be killed by the OS while processing a critical operation (uploading a payment, syncing a document). Every mobile system design must answer: ‘What happens if the OS kills this process mid-operation?’”
- Rollout: “In a system design interview, mention staged rollout unprompted. ‘I would ship this behind a feature flag with a 1% rollout, monitoring crash-free rate and the key business metric for 48 hours before expanding.’ This signals production awareness.”
- Rollback: “Mention your rollback strategy for every major component. ‘If the WebSocket connection manager causes battery drain, the feature flag falls back to polling. If the new caching layer corrupts data, the flag disables it and falls back to network-only reads.’”
- Measurement: “Define success metrics for your design: performance targets (startup <1s, scroll 60fps), reliability targets (crash-free >99.5%), and business targets (engagement, conversion). Interviewers want to see that you think about measurable outcomes, not just architecture diagrams.”
- Cost: “Proactively discuss the operational cost: ‘This design requires N API calls per session, which at 1M DAU translates to N million API calls per day. Here is how I would reduce that with caching and batching.’”
- Security/Governance: “For any design involving user data, mention: ‘Sensitive data is encrypted at rest using Keychain/Keystore, transmitted over TLS with pinning on sensitive endpoints, and retained according to our data retention policy.’ This signals security-first thinking.”
AI-Assisted Engineering Lens: Mobile System Design
12. Mobile Testing Strategy
12.1 Testing Pyramid for Mobile
- Unit tests (70%): ViewModel logic, use cases, utilities, formatters. These run in milliseconds, need no device/emulator, and catch logic bugs.
- Integration tests (20%): Repository + fake API, database operations, navigation flows. These need a JVM (Android) or Simulator (iOS) but are still fast.
- UI/E2E tests (10%): Full user flows on real devices or emulators. Slow, flaky, but catch integration issues that other tests miss.
12.2 Unit Testing
Android (JUnit + MockK/Mockito):
12.3 UI Testing
Espresso (Android):
12.4 Snapshot Testing
Snapshot tests capture a rendered screenshot of a UI component and compare it against a reference image. Any pixel difference fails the test.
Why snapshot testing matters on mobile:
- Catches unintentional UI regressions (a padding change, a wrong color, a broken layout)
- Faster than UI tests (no navigation, no interaction, just render and compare)
- Especially valuable for design system components — ensures every button, card, and dialog renders exactly as designed
Tools: swift-snapshot-testing (iOS, from Point-Free), Paparazzi (Android, from Cash App; renders without a device/emulator), Screenshot Testing for Android (Facebook).
The tradeoff: Snapshot tests are brittle. Any intentional UI change requires updating all affected snapshots. For a component library with 200 snapshots, a design refresh means regenerating 200 reference images. Use snapshot tests for stable components, not for screens that change frequently.
12.5 Device Lab and Real Device Testing
Emulators/Simulators vs Real Devices:

| Aspect | Emulator/Simulator | Real Device |
|---|---|---|
| Speed | Fast to spin up | Requires physical hardware |
| Cost | Free | $200-1000+ per device |
| Accuracy | 95% — misses GPU rendering issues, Bluetooth, NFC, camera | 100% — the real thing |
| CI/CD | Easy to run in cloud CI | Requires device farms (Firebase Test Lab, BrowserStack, AWS Device Farm) |
| When to use | Development, unit tests, integration tests, snapshot tests | Final validation, performance testing, hardware-specific features |
AI-Assisted Engineering Lens: Mobile Testing
13. Cross-Chapter Connections
Mobile engineering does not exist in isolation. A senior mobile engineer needs to understand how mobile connects to every other system in the stack.
Mobile + Backend APIs
Mobile + Real-Time Systems
WebSockets on mobile have platform-specific challenges that do not exist on web:
- The OS can kill the WebSocket connection when the app backgrounds (iOS is aggressive about this)
- Network transitions (WiFi to cellular) require reconnection logic
- Battery optimization modes throttle background network activity
Mobile + Authentication
The OAuth flow on mobile is different from web:
- Mobile apps cannot securely store client secrets (the binary is decompilable)
- PKCE (Proof Key for Code Exchange) is mandatory — it prevents authorization code interception
- Biometric auth adds a local authentication layer that does not replace server auth — it gates access to the locally stored token
- Token refresh must happen silently in the background; never show a login screen because the access token expired
Mobile + Performance Monitoring
Mobile performance monitoring requires client-side instrumentation:
- Startup time tracking (cold/warm/hot start)
- Network request timing (broken down by endpoint, connection type, region)
- Frame rate monitoring (detect jank in real time)
- Crash-free rate and ANR rate
Mobile + Design Systems
The debate between “responsive web” vs “separate mobile app” connects to broader frontend and product strategy:
- Responsive web: One codebase, works everywhere. Limited access to device features. Performance ceiling is lower.
- Adaptive web (PWA): Service workers enable offline support, push notifications (Android, and iOS 16.4+ for web apps added to the Home Screen), and installation. Still limited platform API access.
- Separate native apps: Best performance and platform integration. Highest development cost.
- Shared design system: Regardless of implementation choice, maintain a shared design system (tokens, components, patterns) across web and mobile. Figma → code generation tools (Figma to SwiftUI/Compose) are improving rapidly.
Interview Questions
Q1: Explain the difference between cold start, warm start, and hot start on mobile. How would you optimize cold start time?
- Define the types clearly.
  - Cold start: the process does not exist. The OS creates the process, runs Application/AppDelegate initialization, creates the first Activity/ViewController, and renders the first frame. This is the slowest path, typically 1-3 seconds.
  - Warm start: the process exists but the Activity was destroyed (Android). Application.onCreate() is skipped, but Activity.onCreate() runs. Faster because process and class loading are already done.
  - Hot start: the app was in the background and is brought to the foreground. onResume() runs, minimal work. Typically under 200ms.
- Cold start optimization strategy:
  - Measure first. Use adb shell am start -S -W on Android or Instruments’ App Launch template on iOS. Establish a baseline.
  - Defer SDK initialization. Analytics, crash reporting, and feature flag SDKs do not need to initialize before the first frame. Move them to a background thread or defer to after onResume.
  - Lazy dependency injection. Do not construct your entire DI graph at startup. Use lazy initialization: create objects when first accessed.
  - Optimize the first screen. The first screen should render from local data (cached, or a static placeholder). Never block the first frame on a network call.
  - Profile with systrace/Instruments. Identify the longest blocking operations on the main thread and eliminate or move them.
- Cite a real example. “At Uber, cold start was 5.5 seconds because they were initializing 30+ SDKs synchronously. They reduced it to under 2 seconds by deferring non-critical initialization and using a static splash screen as a loading facade.”
- Conflating cold start with splash screen display time
- Not knowing the measurement tools (Systrace, Instruments, adb shell am start)
- Saying “just move everything to a background thread” without considering which operations must be on the main thread (UI initialization)
- Failure mode: “A team defers all SDK initialization to a background thread but forgets that the analytics SDK must initialize before the first screen tracks a view event. Result: the first screen’s analytics events are dropped for every cold start. Fix: categorize SDKs into ‘must be before first frame’ (crash reporting), ‘must be before first interaction’ (analytics), and ‘can be fully deferred’ (feature flags, A/B testing).”
- Measurement: “Track cold start time per release as a P50 and P95 metric, segmented by device tier (low-end, mid-range, flagship). A 200ms regression on a Pixel 8 might be a 600ms regression on a budget device with an older CPU.”
- Cost: “Every SDK added to the app increases cold start time by 20-100ms. At 10 SDKs, that is 200ms-1s of startup time attributable to third-party code. Audit SDK initialization quarterly and remove unused SDKs.”
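The SDK categorization from the failure mode above, together with lazy initialization, can be sketched as follows. The three phase names mirror the categories in the text; `SdkInitializer`, `runPhase`, and `lazy` are hypothetical helpers, and the SDK names in the usage note are placeholders rather than real libraries.

```typescript
type InitPhase = "BEFORE_FIRST_FRAME" | "BEFORE_FIRST_INTERACTION" | "DEFERRED";

interface SdkInitializer {
  name: string;
  phase: InitPhase;
  init: () => void;
}

// Runs only the initializers registered for the given phase and reports
// which ones started, so startup traces can attribute time per phase.
function runPhase(sdks: SdkInitializer[], phase: InitPhase): string[] {
  const started: string[] = [];
  for (const sdk of sdks) {
    if (sdk.phase === phase) {
      sdk.init();
      started.push(sdk.name);
    }
  }
  return started;
}

// Lazy holder: the factory runs once, on first access, not at startup.
function lazy<T>(factory: () => T): () => T {
  let created = false;
  let value: T | undefined;
  return () => {
    if (!created) {
      value = factory();
      created = true;
    }
    return value as T;
  };
}
```

An app would call `runPhase(sdks, "BEFORE_FIRST_FRAME")` during launch, the second phase once the first frame is drawn, and the deferred phase when the main thread is idle; objects wrapped in `lazy` cost nothing until a screen actually asks for them.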
Q2: How does React Native's New Architecture differ from the old bridge-based architecture?
- JSI (JavaScript Interface). Instead of JSON serialization, JSI exposes C++ host objects directly to JavaScript. JS code can call native methods synchronously, like function calls, without serialization overhead. This is 10-100x faster for frequent operations.
- Fabric. The new rendering system. It supports concurrent rendering — views can be created and measured on any thread, not just the main thread. This enables interruptible rendering (the UI stays responsive even during complex layout calculations) and reduces the scheduling delays of the old renderer.
- TurboModules. Native modules are now lazy-loaded. The old architecture loaded every native module at startup, even if they were never used. TurboModules load on first access, using JSI for direct communication. This significantly reduces startup time.
- Codegen. Generates type-safe C++ interfaces from a schema. Catches type mismatches at build time instead of runtime — no more ‘undefined is not an object’ crashes from JS-native type mismatches.
- Not knowing the Bridge existed or why it was a problem
- Saying “React Native is just a WebView” (it is not, and has not been since its inception)
- Conflating the New Architecture with specific versions of React Native
Q3: How would you implement offline support for a mobile app?
- Principle: local database is the source of truth. The UI always reads from the local database (Room on Android, Core Data on iOS). The server is a sync target, not the primary data source.
- Write path. All user actions write to the local database first. UI updates immediately from the local change. The write is enqueued in a persistent sync queue.
- Sync engine. A background worker (WorkManager on Android, BGTaskScheduler on iOS) processes the sync queue when connectivity is available. Each operation is retried with exponential backoff on failure.
- Conflict resolution. Choose a strategy based on the data type:
  - Last-write-wins for simple values (user preferences, settings).
  - Field-level merge for structured entities (merge non-conflicting field changes, prompt user for same-field conflicts).
  - CRDTs for collaborative content (text editing, shared lists).
- ID mapping. Entities created offline get local UUIDs. After sync, the server assigns a server ID. Maintain a mapping table and update all references.
- Delta sync. Track a sync timestamp or version vector. On reconnect, request only changes since the last sync, not the entire dataset.
- Edge cases I always address:
  - What happens if the server rejects a synced operation? Show the user an error and offer to discard or retry.
  - What if the user deletes something offline that another user modified? Deletion wins, but the modified version is preserved in a recovery area.
  - What about large binary files (images, attachments)? Sync metadata immediately, upload binaries asynchronously, show a placeholder until upload completes.”
- Failure mode: “The sync queue processes operations out of order. A ‘delete note’ operation arrives at the server before the ‘create note’ operation. The server rejects the delete (entity does not exist), then processes the create, leaving a note that should have been deleted. Fix: enforce causal ordering in the sync queue — operations on the same entity must be processed in creation order.”
- Measurement: “Track sync success rate (target >99%), average sync latency (time from write to server confirmation), conflict rate per entity type, and ‘stale data duration’ (how long a user sees outdated data before sync completes).”
- Security/Governance: “Offline data persists on device potentially indefinitely. For regulated industries (healthcare, finance), implement a maximum offline data retention period — after N days without sync, prompt the user to connect or automatically purge sensitive data.”
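Two pieces of the sync engine described in this answer lend themselves to a short sketch: the retry delay with exponential backoff, and the causal-ordering rule from the failure mode (operations on the same entity go out in creation order). The names `SyncOp`, `backoffDelayMs`, and `nextBatch`, and the 1-second base and 60-second cap, are assumptions; production code usually adds random jitter to the delay, which is omitted here to keep the output deterministic.

```typescript
interface SyncOp {
  opId: number;     // increases in creation order
  entityId: string; // the note, message, etc. the operation targets
  kind: "CREATE" | "UPDATE" | "DELETE";
  attempts: number; // failed attempts so far
}

// Delay before the next retry: base * 2^attempts, capped at maxMs.
function backoffDelayMs(attempts: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempts));
}

// Operations that may be sent now: for each entity, only its oldest pending
// operation. Later operations on that entity wait until it is confirmed.
function nextBatch(queue: SyncOp[]): SyncOp[] {
  const seenEntity: Record<string, boolean> = {};
  const batch: SyncOp[] = [];
  const ordered = queue.slice().sort((a, b) => a.opId - b.opId);
  for (const op of ordered) {
    if (!seenEntity[op.entityId]) {
      seenEntity[op.entityId] = true;
      batch.push(op);
    }
  }
  return batch;
}
```

With this rule a "delete note" can never overtake the "create note" that precedes it, while operations on unrelated entities still sync in parallel.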
Q4: Your app has a 97% crash-free rate. How do you get it to 99.9%?
- Triage by impact. Open Crashlytics/Sentry and sort crashes by affected user count, not occurrence count. One crash affecting 10,000 users matters more than a different crash occurring 10,000 times for 50 users (the latter is a small group hitting the same crash repeatedly).
- Categorize the top 10 crashes:
  - Null pointer / force unwrap (usually 30-40% of crashes): Fix with better null handling, optional chaining, and defensive coding.
  - ANR / UI thread blocking (common on Android): Identify the blocking operation and move it off the main thread.
  - Out of memory (especially image-heavy apps): Implement proper image downsampling and cache eviction.
  - Concurrency issues (race conditions, thread safety): Use thread-safe data structures or serialize access.
  - Platform-specific (device fragmentation on Android): Test on affected devices, add device-specific workarounds.
- Fix in priority order. Fix the top 3 crashes first. Each fix should measurably improve the crash-free rate. Track the improvement after each release.
- Prevent regression. Add crash-scenario-specific tests. If a crash was caused by a null user object after process death, add a test that simulates process death and verifies the screen handles a null user gracefully.
- Proactive measures:
- Enable strict null safety (Kotlin’s built-in null safety, Swift’s optionals).
- Add global error boundaries (try-catch at the ViewModel level, global exception handler for uncaught exceptions on Android).
- Use feature flags to disable problematic features remotely while a fix ships.
- Set a quality bar. Block releases if the crash-free rate drops below 99.5% in staged rollout. Stop the rollout at 5% until the regression is fixed.
- “I would fix all the bugs.” — No prioritization framework. At 97% crash-free rate, there could be hundreds of distinct crashes. You cannot fix them all at once.
- “We should add more try-catch blocks.” — Suppressing exceptions hides bugs. The app stops crashing but starts behaving incorrectly, which is worse.
- “We need better QA testing.” — Testing catches bugs before release, but the question is about production crashes. QA and production reliability are complementary, not substitutes.
- “I would sort crashes by unique affected users, not occurrence count. One crash affecting 10,000 users is a higher priority than a crash that occurs 10,000 times for the same 50 users.”
- “The path from 97% to 99.5% is about fixing the top 5-10 crashes. The path from 99.5% to 99.9% is about systemic prevention: strict null safety, global error boundaries, StrictMode enforcement, and automated process death testing in CI.”
- “I would set a release quality gate: if crash-free rate drops below 99% during 1% staged rollout, the rollout halts automatically and the release manager is paged.”
- Failure mode: “A fix for the #1 crash introduces a regression that creates a new #1 crash. The crash-free rate stays at 97% but the affected user population shifts. Fix: compare crash-free rate AND top crash hashes between releases. A new crash hash appearing in the top 5 after a release is a regression, even if the overall rate did not change.”
- Rollout: “Ship crash fixes behind feature flags when possible. If the fix changes a code path that many users exercise, validate it at 1% rollout before expanding.”
- Measurement: “Track crash-free rate as a time series, segmented by app version, OS version, and device model. Set alerts for: overall rate drops below 99%, any single crash affecting >0.1% of users, and any new crash appearing in the top 10.”
- Cost: “At 1M DAU with 97% crash-free rate, 30,000 users crash daily. If 1% of those leave a 1-star review, that is 300 negative reviews per day. The App Store rating impact is measurable — every 0.1-star drop in rating reduces install conversion by approximately 5%.”
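The triage rule (rank by unique affected users) and the regression check (a new crash hash entering the top list) described above can be sketched like this. It is an illustrative Python sketch only; `CrashGroup`, `triage`, and `new_top_crashes` are hypothetical names, and the sample numbers are made up to mirror the 10,000-users versus 50-users example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashGroup:
    crash_hash: str      # stable grouping key from the crash reporter
    occurrences: int     # total crash events
    affected_users: int  # unique users who hit the crash


def triage(groups, top_n=3):
    """Rank crash groups by unique affected users, not raw occurrence count."""
    return sorted(groups, key=lambda g: g.affected_users, reverse=True)[:top_n]


def new_top_crashes(previous_top, current_top):
    """Hashes in the current top list that were absent in the previous release."""
    previous = {g.crash_hash for g in previous_top}
    return [g.crash_hash for g in current_top if g.crash_hash not in previous]


groups = [
    CrashGroup("npe_profile", occurrences=12_000, affected_users=10_000),
    CrashGroup("oom_gallery", occurrences=10_000, affected_users=50),
    CrashGroup("anr_checkout", occurrences=4_000, affected_users=3_500),
]
ranked = triage(groups)

# Next release: the old #1 is fixed, but a new hash shows up in the top list.
next_release = [
    CrashGroup("npe_settings", occurrences=9_000, affected_users=8_000),
    CrashGroup("anr_checkout", occurrences=4_100, affected_users=3_600),
]
regressions = new_top_crashes(ranked, triage(next_release))

# Cost arithmetic from the text: 1M DAU at a 97% crash-free rate.
crashing_users = 1_000_000 * 3 // 100
one_star_reviews = crashing_users // 100
```

Comparing hashes between releases is what catches the case where the overall rate stays flat while the crashing population shifts.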
Q5: How do you manage state across configuration changes on Android?
- ViewModel (survives config changes, NOT process death). Store UI state and in-progress operations here. The ViewModel lives in the ViewModelStore, which is retained across Activity recreation. For most state, this is sufficient.
- SavedStateHandle (survives config changes AND process death). For state that must survive process death (form inputs, selected tab, scroll position), use SavedStateHandle inside the ViewModel. Under the hood, it uses the Activity’s onSaveInstanceState Bundle, which the OS persists.
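- Persistent storage (survives process death, app restarts, and device reboots). Room database, SharedPreferences, or DataStore for data that must outlive the process or that is too large for SavedStateHandle (the Bundle has a ~500KB limit). This data is still deleted on uninstall unless it is backed up or also stored on the server.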
- Transient UI state (isLoading, error message) → ViewModel property
- Recoverable UI state (scroll position, form text, selected tab) → SavedStateHandle
- User data (settings, cached content, credentials) → Room / DataStore / EncryptedSharedPreferences
Simulate process death with adb shell am kill com.myapp. Navigate to a deep screen, background the app, run the kill command, then reopen the app from recents. If the app crashes or shows a blank screen, you have a process death handling bug.
Common mistakes:
- Storing everything in the ViewModel and ignoring process death entirely
- Using onSaveInstanceState for large data (it has a 500KB limit; exceeding it throws a TransactionTooLargeException)
- Not knowing that the ViewModel dies with the process
Q6: Compare RecyclerView and LazyColumn (Jetpack Compose) for list performance
- Explicitly recycles ViewHolder objects. You create a ViewHolder, bind data to it, and the RecyclerView recycles it when it scrolls off-screen.
- DiffUtil calculates the minimum set of changes between two lists, enabling efficient partial updates.
- ItemAnimator provides built-in insert/remove/move animations.
- Mature, battle-tested, highly optimizable. Supports complex layouts (grid, staggered grid, horizontal/vertical).
- Downside: verbose boilerplate (Adapter, ViewHolder, layout XML).
- No explicit recycling or ViewHolders. You declare items in a lambda, and Compose manages composition/disposal.
- The key parameter serves the same purpose as stable IDs in RecyclerView: it tells Compose how to identify items for recomposition.
- Simpler API, less boilerplate: a LazyColumn with items is 10 lines of code vs 50+ for RecyclerView.
- Performance is good but not yet equal to a well-optimized RecyclerView. Compose’s recomposition overhead means that for extremely complex items or very fast scrolling, RecyclerView can still win.
- New Compose-first project: LazyColumn is the default. The productivity gain from less boilerplate outweighs the marginal performance difference for most apps.
- Existing View-based project with performance-critical lists (social media feed with video, financial ticker): Keep RecyclerView and migrate other screens to Compose. The RecyclerView in a Compose hierarchy works fine via AndroidView interop.
- Maximum performance: RecyclerView with RecycledViewPool shared across multiple RecyclerViews, prefetch enabled, and setHasStableIds(true).
In Compose, use @Stable or @Immutable annotations on data classes, provide stable key values, and avoid creating new objects inside the items lambda.
Words that impress: “ViewHolder recycling pool,” “DiffUtil on a background thread,” “recomposition stability,” “LazyListState for scroll restoration”
Q7: How does certificate pinning work on mobile and what are the operational risks?
- Network Security Config (XML-based, recommended): Declare pins in network_security_config.xml. The OS enforces them automatically.
- OkHttp CertificatePinner: Programmatic pinning at the HTTP client level.
- Pin the SPKI (Subject Public Key Info) hash, not the certificate. The public key survives certificate renewal as long as the key pair is reused.
- URLSession delegate method urlSession(_:didReceive:completionHandler:) to validate the server certificate against pinned values.
- TrustKit (open-source library) provides declarative pinning with reporting.
- Certificate rotation breaks the app. If your certificate expires and your app has the old pin, all network requests fail. The app is completely broken and users cannot even reach your server to get an update. This has happened to major apps.
- Mitigation: Pin multiple public keys — your current key and at least one backup key for a certificate you have issued but not deployed. Include an expiration date for pins, after which pinning is disabled (a safety net).
- Emergency kill switch: A feature flag evaluated before pinning is applied. If you need to disable pinning in an emergency, the flag endpoint must itself be unpinned (or use a separate, unpinned domain).
- Pin for: authentication endpoints, payment endpoints, highly sensitive data.
- Consider skipping for: public content endpoints, CDN-served images (CDN certificate rotation is frequent and outside your control).
- Pinning the leaf certificate instead of the public key (breaks on every certificate rotation)
- Not including backup pins
- Pinning CDN endpoints (CDN providers rotate certificates frequently)
- No kill switch for disabling pinning in emergencies
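The pin check itself is small, which makes the operational safeguards above (backup pins, pin expiration, kill switch) easy to reason about in code. The following Python sketch is an illustration under stated assumptions: it takes the DER-encoded SubjectPublicKeyInfo bytes as input (extracting them from a real certificate is out of scope), the byte strings are placeholders rather than real keys, and `spki_pin` and `is_connection_trusted` are hypothetical names. Normal TLS chain validation is assumed to run separately in all cases.

```python
import base64
import hashlib
from datetime import date


def spki_pin(spki_der: bytes) -> str:
    """Base64-encoded SHA-256 digest of the DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def is_connection_trusted(chain_spki_ders, pinned, pins_expire, today,
                          kill_switch=False):
    """Accept if pinning is disabled or any key in the chain matches a pin."""
    if kill_switch or today >= pins_expire:
        # Safety net: pinning is off, only standard TLS validation applies.
        return True
    return any(spki_pin(der) in pinned for der in chain_spki_ders)


# Placeholder bytes standing in for real SPKI structures.
current_key = b"current-key-spki-bytes"
backup_key = b"backup-key-spki-bytes"
attacker_key = b"attacker-key-spki-bytes"

pins = {spki_pin(current_key), spki_pin(backup_key)}
expiry = date(2030, 1, 1)
today = date(2026, 1, 1)

trusted_current = is_connection_trusted([current_key], pins, expiry, today)
trusted_after_rotation = is_connection_trusted([backup_key], pins, expiry, today)
trusted_attacker = is_connection_trusted([attacker_key], pins, expiry, today)
trusted_after_expiry = is_connection_trusted([attacker_key], pins, expiry,
                                             date(2030, 1, 2))
```

The last case shows the trade-off of pin expiration: after the expiry date the app stops bricking itself on a missed rotation, but it also stops rejecting unpinned keys.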
Q8: Design the data layer for a mobile app that works offline and syncs with a server
- Local database (Room on Android, Core Data on iOS). This is the source of truth. Every entity has: localId (UUID, client-generated), serverId (nullable, server-assigned after first sync), lastModifiedLocal (timestamp of last local change), syncStatus (enum: synced, pendingCreate, pendingUpdate, pendingDelete).
- Repository pattern. The repository exposes data as observable streams (Flow on Android, Combine on iOS). The UI observes these streams and re-renders on changes. The repository decides whether to read from local DB, refresh from network, or both.
- Sync engine. A dedicated component that:
- Processes the sync queue (entities with syncStatus != synced) in order.
- Uses a lastSyncTimestamp to request only server changes since the last sync.
- Maps local IDs to server IDs after first sync.
- Handles conflict resolution (LWW, merge, or user-prompted).
- Runs on WorkManager (Android) / BGTaskScheduler (iOS) for background sync.
- Network layer. Retrofit/Ktor (Android), URLSession/Alamofire (iOS). Handles authentication, retry logic, and error mapping.
- Sync fails midway: the sync engine must be idempotent. Retry from the beginning and the server deduplicates by localId.
- Entity referenced by another entity is not yet synced: process creates before updates, and maintain referential integrity in the sync queue order.
- The server rejects a change (validation error): mark the entity as syncFailed with an error message, notify the user.
- Failure mode: “The ID mapping table is corrupted after a failed sync — a local ID is mapped to a wrong server ID. All subsequent operations on that entity go to the wrong server record. Fix: wrap the entire sync-and-map operation in a database transaction. If any part fails, roll back the entire mapping.”
- Rollout: “Ship the sync engine in phases: Phase 1 (read-only sync — pull server data to local DB), Phase 2 (write sync — push local changes to server), Phase 3 (bidirectional delta sync with conflict resolution). Each phase behind a separate flag.”
- Measurement: “Sync success rate, sync latency (P50/P95), conflict rate, ID mapping consistency (periodic audit comparing local and server state), and data integrity score (checksum comparison of entity fields between client and server after sync).”
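To make the idempotency and ID-mapping points concrete, here is a small Python sketch of the push half of a sync engine. It assumes the server deduplicates creates by the client-generated local ID, as described above. `Note`, `FakeServer`, and `push_pending` are hypothetical names, the server is an in-memory stand-in, and a real app would apply the mapping inside a database transaction.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Note:
    text: str
    local_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    server_id: Optional[str] = None
    sync_status: str = "pendingCreate"


class FakeServer:
    """In-memory stand-in for the backend; deduplicates creates by local_id."""

    def __init__(self):
        self.by_local_id = {}

    def create(self, local_id, text):
        if local_id not in self.by_local_id:
            self.by_local_id[local_id] = f"srv-{len(self.by_local_id) + 1}"
        return self.by_local_id[local_id]


def push_pending(notes, server):
    """Push pending creates; safe to re-run after a partial failure."""
    for note in notes:
        if note.sync_status != "pendingCreate":
            continue
        server_id = server.create(note.local_id, note.text)
        # Apply the ID mapping and the status change together, so a note is
        # never marked synced without its server_id.
        note.server_id, note.sync_status = server_id, "synced"


server = FakeServer()
note = Note("buy milk")

# Simulate a sync that reached the server but crashed before the local commit.
first_id = server.create(note.local_id, note.text)

# The retry sends the same create again; the server returns the same ID.
push_pending([note], server)
```

Because the retry is keyed on the local ID, "retry from the beginning" cannot create a duplicate server record.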
Q9: How would you reduce the battery impact of a location-tracking feature?
- Significant location changes (iOS) / passive location (Android). The OS notifies your app when the user moves by approximately 500 meters. Extremely battery-efficient — piggybacks on location updates the OS is already computing for other apps. Good for: check-in apps, weather apps, regional content.
- Geofencing. Define circular regions. The OS notifies your app when the user enters or exits. Uses cell tower/WiFi triangulation (not GPS) for power efficiency. Good for: store proximity alerts, home/office detection.
- Balanced accuracy at reduced frequency. Request location updates every 30-60 seconds with ‘balanced’ accuracy (cell tower + WiFi, not GPS). Good for: fitness tracking when not exercising, food delivery driver tracking between deliveries.
- High accuracy at high frequency. GPS at 1-second intervals. Necessary only for active navigation, running/cycling tracking, and ride-sharing during an active trip. Drain rate: ~10-15% battery per hour.
- When the user is not in an active session (no ride, not exercising): use significant location changes or geofencing.
- When the user starts an active session (starts a ride, begins a run): switch to high-accuracy, high-frequency updates.
- When the session ends: immediately switch back to low-power mode.
- When the app backgrounds during an active session: reduce frequency to every 5-10 seconds (still high accuracy, but less frequent).
Android-specific: Use LocationRequest.create().setPriority() with PRIORITY_HIGH_ACCURACY, PRIORITY_BALANCED_POWER_ACCURACY, or PRIORITY_LOW_POWER depending on the use case.
iOS-specific: Use CLLocationManager with appropriate desiredAccuracy (kCLLocationAccuracyBestForNavigation, kCLLocationAccuracyNearestTenMeters, kCLLocationAccuracyThreeKilometers) and distanceFilter (minimum distance in meters before the next update).
The test that catches battery issues: Run the app in the foreground and background for 1 hour with Instruments (iOS) or Battery Historian (Android). Compare battery drain against a baseline with location tracking disabled. If the delta is more than 5% per hour when the user is not in an active session, your low-power mode is not working correctly.
Common mistakes:
- Using PRIORITY_HIGH_ACCURACY everywhere, even when coarse location is sufficient
- Not switching to low-power mode when the app backgrounds or the session ends
- Forgetting to stop location updates when the feature is not in use (location manager leak)
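The adaptive strategy described above (low power outside a session, high accuracy during one, reduced frequency in the background) boils down to a small decision function. The Python sketch below is platform-neutral and illustrative; `LocationConfig` and `choose_location_config` are hypothetical names, and the 5-second background interval is one value picked from the 5-10 second range given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationConfig:
    mode: str              # "significant_change" or "high_accuracy"
    interval_seconds: int  # 0 means the OS decides when to deliver updates


def choose_location_config(session_active: bool,
                           app_in_foreground: bool) -> LocationConfig:
    """Pick the cheapest location mode that still satisfies the use case."""
    if not session_active:
        # No ride or workout in progress: rely on OS-level coarse updates.
        return LocationConfig("significant_change", 0)
    if app_in_foreground:
        # Active session on screen: GPS at 1-second intervals.
        return LocationConfig("high_accuracy", 1)
    # Active session in the background: still accurate, but less frequent.
    return LocationConfig("high_accuracy", 5)


idle = choose_location_config(session_active=False, app_in_foreground=True)
riding = choose_location_config(session_active=True, app_in_foreground=True)
backgrounded = choose_location_config(session_active=True, app_in_foreground=False)
```

Centralizing the decision in one pure function makes it hard to forget the "session ended" transition, which is the most common source of battery regressions.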
Q10: What are the key differences between Jetpack Compose and SwiftUI?
- Compose: remember, mutableStateOf, State<T>. State is held in the composition and recomposition is triggered by state changes. derivedStateOf for computed state that should not trigger unnecessary recompositions.
- SwiftUI: @State, @Binding, @Observable (iOS 17+), @StateObject / @ObservedObject (older). SwiftUI’s property wrappers are elegant but the proliferation of different wrappers (@State, @StateObject, @ObservedObject, @EnvironmentObject, @Observable) confuses newcomers.
- Compose: Modifier chains (Modifier.padding().fillMaxWidth().background()). Modifiers are ordered: padding before background gives a different result than background before padding. This catches people.
- SwiftUI: Similar modifier chains, but SwiftUI uses a ViewBuilder DSL that feels more natural to Swift developers.
- Compose: Jetpack Navigation Compose, built around NavHost with route strings. Feels bolted on, and type-safe navigation (introduced in 2024) is still maturing.
- SwiftUI: NavigationStack with NavigationLink and navigationDestination. Cleaner API after iOS 16’s navigation overhaul.
- Compose: Can host Android Views inside AndroidView and can be hosted inside XML layouts. Bidirectional interop is excellent. Compose has stabilized faster because Google controls both the framework and the platform.
- SwiftUI: Can host UIKit views inside UIViewRepresentable and can be hosted inside UIKit via UIHostingController. Some UIKit components still lack SwiftUI equivalents (e.g., complex UICollectionView layouts). Apple’s annual iOS releases sometimes break SwiftUI behavior.
- Saying they are “basically the same” without acknowledging state management differences
- Not knowing about Compose’s modifier ordering sensitivity
- Not mentioning interop with the legacy frameworks
Q11: A user reports that your app uses 2GB of cellular data per month. How do you investigate and reduce it?
- Instrument network usage. Use Charles Proxy or Proxyman to capture all network traffic from the app for a typical 1-hour usage session. Group requests by endpoint, measure total bytes transferred.
- Identify the top offenders. Common culprits:
- Images downloaded at full resolution. If the app is fetching 4000x3000 photos to display in 200x200 thumbnails, each image carries 300x more pixels than necessary (12,000,000 vs 40,000). Fix: request images at display size from the CDN.
- Video pre-loading. Auto-playing videos that buffer full quality on cellular. Fix: reduce quality on cellular, only buffer the first 5 seconds, do not pre-buffer off-screen videos.
- Polling. An API call every 5 seconds for real-time data. 12 calls/minute x 60 minutes/hour x 16 hours/day = 11,520 calls/day. If each response is 10KB, that is 115MB/day, 3.5GB/month. Fix: switch to WebSocket for real-time data, or increase polling interval to 30-60 seconds.
- Redundant fetches. The same data fetched multiple times without caching. Fix: implement proper cache headers (ETag, Cache-Control) and a local cache.
- Analytics/telemetry payloads. If analytics events are not batched, each event is a separate request with full HTTP headers. Fix: batch events and send every 30 seconds or on app background.
- Apply compression. Ensure gzip/Brotli is enabled on all API responses. Check that the CDN serves compressed images (WebP/AVIF).
- Add a data-saver mode. Give users control: reduce image quality, disable auto-play videos, increase sync intervals. Android has a system-level Data Saver setting: read it via ConnectivityManager.getRestrictBackgroundStatus(), and check ConnectivityManager.isActiveNetworkMetered() before large transfers.
- Monitor going forward. Track per-session data usage as a metric in your analytics. Alert if average data usage per session increases by more than 20% between releases.
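The polling arithmetic above is worth being able to reproduce quickly, because it shows how much a longer interval saves. A short Python sketch follows; it assumes 1KB = 1,000 bytes, a 30-day month, and response size only (no headers), and `polling_cost` is a hypothetical helper name.

```python
def polling_cost(interval_seconds, active_hours_per_day, response_bytes):
    """Return (calls per day, bytes per day) for a fixed polling interval."""
    calls = (3600 // interval_seconds) * active_hours_per_day
    return calls, calls * response_bytes


# Every 5 seconds, 16 active hours per day, 10KB per response.
calls_5s, bytes_5s = polling_cost(5, 16, 10_000)
monthly_gb_5s = bytes_5s * 30 / 1e9

# Same traffic pattern with a 60-second interval.
calls_60s, bytes_60s = polling_cost(60, 16, 10_000)
monthly_gb_60s = bytes_60s * 30 / 1e9
```

Moving from a 5-second to a 60-second interval cuts the call count by a factor of 12, which is usually the cheapest fix to ship before a WebSocket migration.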
Q12: How do push notifications work end-to-end, and why can't you guarantee delivery?
- At app install, the device registers with APNs/FCM and receives a device token.
- The app sends this device token to your backend, which stores it.
- When your backend wants to notify the user, it sends a payload to APNs/FCM with the device token.
- APNs/FCM delivers the notification to the device.
- The device OS displays the notification (or wakes the app for a silent push).
- Stale tokens. The user uninstalled the app or the token rotated. APNs returns an error for invalid tokens (your backend should clean them up), but FCM may not.
- Device is offline. APNs stores the most recent notification per app-id and delivers it when the device reconnects. FCM stores up to 100 messages for up to 4 weeks. But only the latest per collapsible key is stored, and messages expire.
- Battery optimization. Doze mode (Android) batches notifications into maintenance windows. A notification sent at 2 AM may not be delivered until 7 AM when the user picks up their phone.
- User-disabled notifications. On iOS, 30-40% of users decline notification permission. On Android, users can disable notifications per-channel or globally.
- Throttling. APNs and FCM throttle high-volume senders. If you send too many notifications to a single device, they will be dropped.
- Priority. FCM distinguishes between high-priority (delivered immediately, even in doze mode) and normal-priority (batched). APNs has similar priorities. Use high priority only for time-sensitive notifications (messages, alarms), not marketing.
- Assuming push delivery is reliable (it is 60-90%, not 100%)
- Using high priority for all notifications (gets throttled)
- Not handling stale device tokens (sending to uninstalled apps wastes quota)
- Failure mode: “A marketing team sends 10 million push notifications simultaneously. APNs/FCM throttle the burst, and 30% of notifications are delayed by 2-4 hours. By the time they arrive, the flash sale they promoted has ended. Fix: implement server-side send pacing (spread notifications over 15-30 minutes) and use high priority only for time-sensitive notifications.”
- Rollout: “New notification types (rich notifications with images, interactive buttons, notification grouping) should be shipped behind a feature flag and A/B tested. Measure: open rate, dismissal rate, and unsubscribe rate per notification type.”
- Measurement: “Track: delivery rate (sent vs received, estimated via silent push acknowledgment), open rate (tapped vs delivered), opt-out rate (users disabling notifications), and ‘notification fatigue index’ (declining open rates over time indicating over-notification).”
- Security/Governance: “Push notification payloads should never contain sensitive data (account balances, medical information, personal messages). The notification content should be a hint (‘You have a new message’) and the actual content should be fetched from the server when the user opens the app. Reason: lock screen notifications are visible without device authentication.”
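The send-pacing fix from the failure mode above can be sketched as a simple batch scheduler on the backend. This Python sketch is illustrative only; `pace_sends` is a hypothetical name, the 20-minute window is one value from the 15-30 minute range in the text, and the 10-second batch interval is an assumption.

```python
def pace_sends(total_notifications, window_seconds, batch_interval_seconds):
    """Split a campaign into evenly sized batches spread across the window.

    Returns a list of (offset_seconds, batch_size) pairs.
    """
    batch_count = max(1, window_seconds // batch_interval_seconds)
    base, remainder = divmod(total_notifications, batch_count)
    schedule = []
    for i in range(batch_count):
        # Spread the remainder over the first batches so sizes differ by at most 1.
        size = base + (1 if i < remainder else 0)
        if size:
            schedule.append((i * batch_interval_seconds, size))
    return schedule


# 10 million notifications spread over 20 minutes, one batch every 10 seconds.
schedule = pace_sends(10_000_000, window_seconds=20 * 60, batch_interval_seconds=10)
```

A worker would then sleep until each offset and enqueue that batch, keeping the request rate to APNs/FCM roughly flat instead of spiking.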
Q13: How would you implement a feature flag system for mobile?
- Flag evaluation SDK. A client-side SDK (LaunchDarkly, Firebase Remote Config, or custom) that evaluates flags locally based on cached rules. No network call on every evaluation — flags are fetched once (on app launch and periodically in the background) and cached.
- Cache and fallback. On first install, use hardcoded defaults (all risky features disabled). On subsequent launches, use cached values until a fresh fetch completes. This ensures flags work offline.
- Evaluation context. Each flag evaluation includes context: user ID, app version, device model, OS version, geography. This enables targeting rules like: ‘Enable for 10% of users on version >= 3.5.0 in the US on Android.’
- Kill switches. Every risky feature has a flag that defaults to OFF. Enable gradually: 1% → 5% → 20% → 50% → 100%. If crash-free rate drops at any stage, disable the flag remotely.
- Stale flag handling. Mobile apps can run with cached flags for weeks if the user does not foreground the app. Set a maximum cache TTL (e.g., 24 hours). On foreground, refresh flags. If the cache is expired and network is unavailable, use safe defaults (features off).
- Flag cleanup. Feature flags accumulate. Old flags that are 100% enabled become dead code that everyone is afraid to remove. Schedule quarterly flag cleanup sprints. After a flag has been 100% enabled for 2+ release cycles with no issues, remove the flag and hard-code the behavior.
- Version targeting is critical. A flag must know the client version. ‘Enable new checkout’ for v3.5+ but not v3.4 (which has a known incompatibility).
- Offline evaluation is mandatory. The flag system must work without a network connection.
- Rollout is slower. It takes 2-4 weeks for most users to update to a new version. Your flag must support a mixed population of old and new versions simultaneously.
- Failure mode: “A feature flag is evaluated before the flag SDK has fetched fresh values. The app uses a stale cached value and enables a broken feature that the team disabled 6 hours ago. Fix: implement a ‘flag freshness’ check — if cached values are older than the maximum TTL and no network is available, use conservative defaults (features off) rather than stale values.”
- Rollout: “Ship the feature flag SDK itself behind a phased rollout. Use a simple server-side version gate first (hardcoded minimum version), then migrate to the full flag system. This avoids the chicken-and-egg problem of ‘how do you flag-gate the flag system.’”
- Measurement: “Track: flag evaluation latency (should be <1ms for cached evaluation), flag fetch success rate, stale flag rate (percentage of evaluations using cached values older than TTL), and ‘flag debt’ (number of flags that have been 100% enabled for >2 release cycles and should be cleaned up).”
- Security/Governance: “Feature flags can be a security vector: if an attacker can manipulate flag values (by intercepting the flag fetch response), they can enable hidden features or disable security controls. Pin the flag service endpoint, validate response signatures, and never gate security features (biometric requirements, certificate pinning) behind remotely-controlled flags that could be disabled by an attacker.”
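The evaluation rules above (local evaluation from cache, version targeting, percentage rollout, safe defaults when the cache is stale and the device is offline) fit in a few lines. The Python sketch below is a simplified illustration, not how LaunchDarkly or Firebase Remote Config work internally; `FlagRule`, `bucket`, and `is_enabled` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FlagRule:
    rollout_percent: int  # 0-100
    min_version: tuple    # e.g. (3, 5, 0)


def parse_version(text):
    return tuple(int(part) for part in text.split("."))


def bucket(flag_name, user_id):
    """Deterministic 0-99 bucket so a user keeps the same assignment."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag_name, user_id, app_version, cached_rules, cache_age_seconds,
               max_ttl_seconds=24 * 3600, network_available=True):
    """Evaluate a flag locally; no network call happens here."""
    rule = cached_rules.get(flag_name)
    cache_expired = cache_age_seconds > max_ttl_seconds
    if rule is None or (cache_expired and not network_available):
        return False  # safe default: risky features stay off
    # With an expired cache and a network, the app keeps using cached rules
    # while a background refresh (not shown) is in flight.
    if parse_version(app_version) < rule.min_version:
        return False
    return bucket(flag_name, user_id) < rule.rollout_percent


rules = {"new_checkout": FlagRule(rollout_percent=100, min_version=(3, 5, 0))}
on_new_version = is_enabled("new_checkout", "user-1", "3.5.0", rules, 60)
on_old_version = is_enabled("new_checkout", "user-1", "3.4.9", rules, 60)
stale_and_offline = is_enabled("new_checkout", "user-1", "3.5.0", rules,
                               cache_age_seconds=48 * 3600,
                               network_available=False)
```

Hashing the flag name together with the user ID keeps rollouts independent, so the same users are not always the first 1% for every feature.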
Q14: Explain the Android Jetpack Navigation component and how it handles deep linking
Q15: How does Kotlin Multiplatform (KMP) share code between iOS and Android?
Q16: Your app's ANR rate is above 1%. How do you diagnose and fix it?
- Get the ANR traces. Google Play Console > Android Vitals > ANRs shows grouped ANR clusters with stack traces. The stack trace shows what the main thread was doing when the ANR triggered.
- Common root causes:
- Synchronous network call on main thread. Still happens, even though StrictMode should catch it. Fix: use coroutines with Dispatchers.IO.
- Synchronous database query on main thread. Room throws an exception by default, but some teams disable this check. Fix: never call allowMainThreadQueries(), which is what turns the check off.
- Lock contention. Main thread waiting for a lock held by a background thread. Fix: use lock-free data structures or reduce critical section size.
- Heavy computation. JSON parsing a large response, image decoding, complex layout calculation. Fix: move to background thread.
- ContentProvider query. Even system ContentProviders can be slow. The main thread queries a ContentProvider that is backed by a slow disk operation. Fix: query on a background thread.
- SharedPreferences.apply(). apply() writes to disk asynchronously, but the write is guaranteed to complete before the Activity’s onStop(). If there are many pending apply() calls, onStop() blocks until they all complete. Fix: use DataStore (Jetpack) which handles this correctly, or batch SharedPreferences writes.
- Reproduce and profile. Enable StrictMode in debug builds to catch main-thread disk/network operations.
- Fix systematically. Do not just fix the top ANR; audit all main-thread operations. Create a lint rule or architectural guideline: no I/O on the main thread, ever. Use Dispatchers.IO for all disk and network operations, Dispatchers.Default for CPU-intensive computation.
- Monitor. Track ANR rate per release. Block releases if ANR rate exceeds 0.5% in staged rollout.
- “ANR means the app is slow. We should optimize the code.” — Too vague. Does not identify the main thread as the bottleneck or distinguish between CPU-bound and I/O-bound blocking.
- “We should increase the ANR timeout.” — You cannot. The 5-second timeout is OS-enforced and not configurable. This reveals a fundamental misunderstanding.
- “Just move everything to background threads.” — Not everything can run on background threads. UI operations must be on the main thread. The skill is knowing what to move off the main thread, not blindly moving everything.
- “I would start with the ANR traces from Google Play Console. The stack trace shows exactly what the main thread was doing when the ANR triggered. In my experience, 70% of ANRs are caused by synchronous I/O (disk or network) on the main thread.”
- “The sneaky ANR cause is SharedPreferences.apply(). It is marketed as async, but the pending writes block onStop(). A screen with 20 apply() calls in quick succession can ANR during Activity transitions. DataStore fixes this.”
- “I would add StrictMode to debug builds as a preventive measure, then add a custom lint rule that flags @MainThread functions calling any I/O API. Prevention is cheaper than production debugging.”
- Failure mode: “A team migrates from SharedPreferences to DataStore but keeps the old SharedPreferences for backward compatibility. Both systems write to disk, and DataStore’s migration reads the old SharedPreferences file on the main thread during first access. ANR rate spikes on app update. Fix: migrate SharedPreferences to DataStore asynchronously during app startup, not on first DataStore access.”
- Measurement: “Track ANR rate per screen, per release, per device tier. Google Play Console provides this but with a 48-hour delay. For faster feedback, instrument your app with a main-thread watchdog that logs when the main thread is blocked for >2 seconds (well before the 5-second ANR threshold).”
- Cost: “Google Play’s algorithm penalizes apps with ANR rate above 0.47%. This means reduced visibility in search results and recommendations. For an app with 1M monthly installs, a 1% ANR rate could reduce new installs by 5-10% due to lower Play Store ranking.”
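The main-thread watchdog mentioned in the measurement point above works on a heartbeat principle: the main thread stamps a timestamp on every loop iteration, and a background thread checks how old that stamp is. The Python sketch below shows only that core logic with injected timestamps so it stays deterministic; `MainThreadWatchdog` is a hypothetical name, and a real Android implementation would post the heartbeat through the main Looper and run the check on a separate thread.

```python
class MainThreadWatchdog:
    """Tracks heartbeats posted by the main thread and reports stalls."""

    def __init__(self, stall_threshold_seconds=2.0):
        self.stall_threshold = stall_threshold_seconds
        self.last_beat = None

    def beat(self, now):
        """Called from a task that the main thread runs on every loop pass."""
        self.last_beat = now

    def check(self, now):
        """Called from a background thread; returns stall seconds or None."""
        if self.last_beat is None:
            return None
        stalled_for = now - self.last_beat
        return stalled_for if stalled_for > self.stall_threshold else None


watchdog = MainThreadWatchdog(stall_threshold_seconds=2.0)
watchdog.beat(now=0.0)
healthy = watchdog.check(now=1.5)  # main thread responded recently
stalled = watchdog.check(now=2.5)  # no heartbeat for 2.5 seconds
watchdog.beat(now=3.0)
recovered = watchdog.check(now=3.2)
```

When `check` reports a stall, the app would capture the main thread's stack trace and log it, which gives feedback well before the 5-second ANR threshold and without the Play Console delay.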
Q17: Compare the architectures of React Native, Flutter, and KMP at a technical level
- Language: JavaScript/TypeScript
- UI rendering: Translates React components to native platform views (UIView on iOS, android.view.View on Android). Your <Text> becomes a UILabel or TextView.
- Communication: Previously async JSON Bridge, now JSI (synchronous C++ bindings). Fabric for concurrent rendering.
- Runtime: JavaScript engine (Hermes, optimized for React Native) running in a separate thread.
- Trade-off: Access to native views means your app looks and feels native. But the JS ↔ native boundary still exists, and complex interactions that cross it frequently (gestures driving native animations) can be a pain point.
- Language: Dart
- UI rendering: Does NOT use native views. Flutter renders every pixel itself using Skia/Impeller graphics engine on a raw canvas surface. A Flutter button is a Flutter-drawn button, not a UIButton or MaterialButton.
- Communication: No bridge needed for UI. Platform channels (async message passing) for native API access (camera, Bluetooth).
- Runtime: Dart compiles to native ARM code (AOT). No interpreter, no JIT in production. Performance is close to native for CPU-bound work.
- Trade-off: Pixel-perfect consistency across platforms (same renderer = same output). But no native UI components means the app does not automatically inherit platform-specific behaviors (iOS scroll physics, Android ripple effects). Flutter approximates them, but users on one platform may notice.
- Language: Kotlin
- UI rendering: Does NOT share UI. The UI layer is fully native: Jetpack Compose on Android, SwiftUI on iOS.
- Communication: No bridge for shared code. Shared Kotlin code compiles to JVM bytecode (Android) or native ARM via LLVM (iOS). It is native code on both platforms.
- Shared layer: Business logic, networking, data models, local storage. Not the UI.
- Trade-off: Maximum UI fidelity (it IS native UI) and maximum logic sharing. But the iOS team must consume Kotlin-generated Objective-C frameworks, which has ergonomic friction.
| Aspect | React Native | Flutter | KMP |
|---|---|---|---|
| UI approach | Native views via bridge/JSI | Custom rendering engine | Fully native UI (not shared) |
| Shared code | 70-90% (UI + logic) | 90-95% (UI + logic) | 30-60% (logic only) |
| Performance | Good (New Arch), not native | Very good (AOT Dart) | Native (compiled Kotlin) |
| Platform fidelity | High (native views) | Medium (custom rendering) | Highest (native UI) |
| Team skill | React/JS developers | Dart developers | Kotlin developers |
| OTA updates | Yes (CodePush) | Limited (Shorebird, early) | No |
| Maturity | High (2015) | High (2018) | Growing (2023 stable) |
Q18: How do you handle backward compatibility when shipping mobile API changes?
-
API versioning. Use URL-based versioning (
/v1/users,/v2/users) or header-based versioning. Support at least N-2 versions (current + two previous majors). Deprecate old versions with clear timelines and in-app messaging. - Additive-only changes. Adding a new field to a JSON response is backward-compatible — old clients ignore it. Removing or renaming a field breaks old clients. Rule: never remove or rename a field in an existing API version. Add new fields, deprecate old ones, and remove them only in the next major version.
-
Feature flags for API behavior. Instead of versioning the entire API, use server-side feature flags keyed to the client version.
X-Client-Version: 3.5.0header allows the server to tailor responses. - Forced upgrade. As a last resort, if an API version has a critical security vulnerability or the maintenance cost is unsustainable, implement a forced upgrade: the server returns a specific error code (e.g., HTTP 426 Upgrade Required), and the client shows a modal directing the user to update.
- Graceful degradation on the client. The client should handle unknown fields gracefully (ignore them), handle missing optional fields (use defaults), and handle new enum values (treat unknown values as a default case). Use lenient JSON parsing:
- Your server has a `minimum_supported_version` config.
- On every API call, the client sends its version in a header.
- If the client version is below the minimum, the server responds with an upgrade-required payload.
- The client shows a blocking UI: ‘Please update the app to continue.’
- Use this sparingly: forcing upgrades frustrates users and increases uninstall rates.
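The server-side check in the steps above could look roughly like the following sketch. The payload shape, the `MINIMUM_SUPPORTED_VERSION` value, and the choice to treat a missing or malformed header as outdated are all assumptions for illustration; versions are compared numerically so that `3.10.0` sorts above `3.9.0`.

```python
MINIMUM_SUPPORTED_VERSION = "3.2.0"  # hypothetical value of minimum_supported_version


def parse_version(text):
    """Turn '3.5.0' into (3, 5, 0); short versions are padded with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def check_client_version(headers, minimum=MINIMUM_SUPPORTED_VERSION):
    """Return (status_code, body) for the version gate only."""
    upgrade_required = (
        426,
        {"error": "upgrade_required", "minimum_version": minimum},
    )
    raw = headers.get("X-Client-Version")
    if raw is None:
        # Assumption: a client that sends no version is treated as too old.
        return upgrade_required
    try:
        client = parse_version(raw)
    except ValueError:
        # Assumption: an unparseable version is also treated as too old.
        return upgrade_required
    if client < parse_version(minimum):
        return upgrade_required
    return 200, {}
```

Keeping the gate in one function makes it easy to run as middleware ahead of every endpoint, and the client only needs to recognize one status code to show the blocking screen.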
Q19: Describe how you would implement end-to-end encryption in a mobile messaging app
- Key generation. Each device generates a long-lived identity key pair and a set of ephemeral pre-keys (one-time-use public keys). The public parts are uploaded to the server.
- Key exchange (X3DH). When Alice wants to message Bob for the first time:
  - Alice fetches Bob’s identity key and a pre-key from the server.
  - Alice performs X3DH (Extended Triple Diffie-Hellman) to derive a shared secret without Bob being online.
  - Alice sends the initial message encrypted with the shared secret, along with her ephemeral public key.
  - Bob decrypts using his private keys and derives the same shared secret.
- Double Ratchet. After the initial key exchange, every message advances the key through a ratchet mechanism. Each message is encrypted with a new key derived from the previous chain state. This provides forward secrecy (a compromised message key does not expose earlier messages) and, because fresh Diffie-Hellman output is mixed in as the conversation continues, it also limits how long a compromise exposes later messages.
- Key storage on device. Private keys are stored in the Keychain (iOS) or Keystore (Android) — hardware-backed, never exported, never sent to the server. The encryption/decryption happens entirely on-device.
- Key verification. Users can verify each other’s identity keys by comparing ‘safety numbers’ (a visual hash of both parties’ identity keys). If a user’s identity key changes (new device, reinstall), the app warns the other party: ‘Safety number changed.’
- Multi-device. When a user has multiple devices (phone + tablet), each device has its own identity key. A message is encrypted separately for each of the recipient’s devices — a message to a user with 3 devices is actually 3 encrypted payloads.
- Group messaging. Signal Protocol uses Sender Keys for groups: the sender distributes a sender key to all group members, then encrypts each message once with the sender key. This is more efficient than encrypting N times for N members.
- Key backup. If the user loses their device, they lose their private keys and all message history. Some apps (WhatsApp) offer encrypted cloud backups. The backup encryption key must be user-controlled (a PIN or passphrase), not server-stored.
- Performance. Encrypting and decrypting thousands of messages requires efficient crypto libraries. Use platform-native crypto (CommonCrypto on iOS, Tink or BouncyCastle on Android) rather than JavaScript-based crypto.
- Offline key exchange. X3DH allows the first message to be sent even if the recipient is offline, using pre-uploaded pre-keys. The server must manage the pre-key supply and alert the client to upload more when running low.
- Failure mode: “A user reinstalls the app and generates a new identity key pair. The old key pair is lost. All previously encrypted messages are unreadable because the decryption keys are gone. This is by design (private keys never leave the device), but users perceive it as data loss. Fix: offer optional encrypted cloud backup of the key material, protected by a user-chosen passphrase.”
- Rollout: “E2EE is an irreversible change — once messages are encrypted, you cannot un-encrypt them without losing the content. Ship in phases: Phase 1 is key generation and exchange (no encryption yet). Phase 2 is encrypt new messages only. Phase 3 is full E2EE with encrypted media and group messaging.”
- Measurement: “Track: encryption/decryption latency per message (should be <5ms with hardware-accelerated AES), key exchange success rate, pre-key replenishment rate, and ‘safety number verification rate’ (what percentage of users verify their contacts’ identity keys).”
- Security/Governance: “E2EE conflicts with legal requirements in some jurisdictions (lawful intercept, content moderation). For enterprise or government customers, consider a ‘compliance key escrow’ mode where a third key is generated and held by an administrator. This weakens E2EE but satisfies regulatory requirements. Document this trade-off explicitly.”
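To make the ratchet idea from the Double Ratchet step concrete, here is a minimal sketch of the symmetric-key chain only: each step feeds the current chain key into HMAC-SHA256 with two different constants, yielding a one-time message key and the next chain key. The constants follow the suggestion in the Double Ratchet specification; the function names are illustrative, and the Diffie-Hellman ratchet, header handling, and the actual message encryption are deliberately left out. A production app should rely on a vetted implementation such as libsignal rather than code like this.

```python
import hashlib
import hmac


def kdf_chain_step(chain_key: bytes):
    """Advance the chain one step: return (next_chain_key, message_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(initial_chain_key: bytes, count: int):
    """Derive `count` successive one-time message keys from a starting chain key."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = kdf_chain_step(chain_key)
        keys.append(message_key)
    return keys
```

Because HMAC is one-way, holding a later chain key does not reveal earlier message keys, which is the forward-secrecy property of the chain. Both parties derive the same sequence as long as they start from the same shared secret.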
Q20: How would you architect a mobile app for a team of 50 engineers?
- Feature modules. Each team owns one or more feature modules (`:feed`, `:checkout`, `:profile`, `:messaging`). Each module contains its own UI, ViewModel, repository, and tests. Modules depend on shared libraries but not on each other.
- Shared libraries. Cross-cutting concerns in shared modules: `:core-network`, `:core-design-system`, `:core-auth`, `:core-analytics`. These are owned by a platform team and have strict API stability requirements.
- App shell. A thin app module (`:app`) that depends on all feature modules, handles navigation between features, and manages the app lifecycle. The app module should contain minimal code: it is a composition point, not a feature.
- Dependency rules: Feature modules can depend on shared libraries. Feature modules CANNOT depend on other feature modules. Communication between features goes through a navigation API or event bus, not direct imports. This rule is enforced by the build system and CI.
- Use Gradle with build caching and incremental compilation. At 50 engineers, full builds can take 15-30 minutes without optimization.
- Module-level caching: if a team only changed `:checkout`, only `:checkout` and `:app` rebuild. Other modules use cached artifacts.
- Remote build cache (Gradle Enterprise/Develocity, or custom): share build cache across the entire team. A change built by one engineer does not need to be rebuilt by others.
- Consider Bazel for very large codebases (100+ modules). Bazel’s fine-grained caching and remote execution are more powerful than Gradle’s, but the migration cost is significant.
- CODEOWNERS file enforcing review requirements per module.
- Each feature module has a designated owning team. Pull requests to that module require approval from the owning team.
- The platform team owns shared libraries and reviews all changes to them.
- Each module has its own unit and integration tests.
- Module tests run in CI on every PR to that module.
- Full app E2E tests run nightly or on release candidates.
- Snapshot tests for the design system ensure visual consistency.
- Feature flags gate all new features. Merge to main does not mean the feature is live.
- Release train: weekly release cut from main. Feature flags control what is enabled.
- Each team enables their features via flags after the release ships, on their own schedule.
- This decouples ‘merge to main’ from ‘ship to users,’ allowing 50 engineers to merge without blocking each other.
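The rule that feature modules must not depend on each other is the kind of check that can run as a CI step. The sketch below assumes the module graph has already been exported as a mapping from module name to its direct dependencies (how that export is produced is not shown), and it classifies modules by the naming convention used above: `:core-*` for shared libraries, `:app` for the shell, and everything else a feature module. That classification is an assumption for illustration.

```python
def is_shared(module: str) -> bool:
    return module.startswith(":core-")


def is_feature(module: str) -> bool:
    # Assumption: anything that is neither the app shell nor a shared library
    # is a feature module.
    return module != ":app" and not is_shared(module)


def find_violations(dependencies):
    """Return sorted (module, dependency) pairs where one feature imports another."""
    violations = []
    for module, deps in dependencies.items():
        if not is_feature(module):
            continue
        for dep in deps:
            if is_feature(dep):
                violations.append((module, dep))
    return sorted(violations)


if __name__ == "__main__":
    graph = {
        ":app": [":feed", ":checkout"],
        ":feed": [":core-network", ":core-design-system"],
        ":checkout": [":core-auth", ":feed"],  # breaks the rule
    }
    for module, dep in find_violations(graph):
        print(f"{module} must not depend on {dep}")
```

Failing the build on a non-empty result keeps the rule from eroding as the team grows; the same idea can be implemented directly in the build tool instead of an external script.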
Follow-Up Question Handling
Mobile interviews often go deep into areas where your experience may be thinner. Here is how to handle that gracefully.
Buying Time Gracefully
- “That is a great question. Let me think through the layers involved.” — Then enumerate: data layer, network layer, UI layer, platform constraints. This gives you 15-20 seconds to organize your thoughts while sounding structured.
- “I have not implemented that exact pattern, but let me reason through it from first principles.” — Then start with what you know. If asked about Flutter’s rendering pipeline and you have not used Flutter, reason from what you know about graphics rendering, GPU composition, and Skia (which also powers Chrome).
- “Let me break that into the parts I know well and the parts I would need to research.” — This signals intellectual honesty and self-awareness, which interviewers value more than faking expertise.
- “In my experience on [Android/iOS], the equivalent is [X]. I would expect [Flutter/React Native/the other platform] to have a similar mechanism because the underlying constraint — [battery/memory/network] — is the same.” — This bridges from your platform expertise to the unknown platform.
Redirecting to Strength
If asked about a platform you do not know deeply:
- “I have not worked with [X platform] directly, but I have solved the same underlying problem on [Y platform] using [Z approach]. The platform APIs differ, but the architectural pattern is the same because the constraint ([memory pressure / battery drain / network reliability]) is universal.”
- Frame answers in terms of constraints and patterns, not platform-specific APIs. The interviewer cares more about your problem-solving approach than your memorization of API names.
Admitting Gaps with Confidence
- “I have not used [X] in production, but here is how I would evaluate it…” — Then discuss trade-offs, when you would use it vs alternatives, and what you would investigate before adopting it.
- “My experience is deeper on the [iOS/Android] side. On [the other platform], I understand the concept is [X], but I would not claim hands-on expertise with the specific APIs.” — Honest, specific, and shows you know enough to know what you do not know.
- “I do not know the answer to that specific implementation detail, but here is how I would find out in production: [check the documentation / profile with Instruments / set up an A/B test].” — Shows engineering maturity.
Professional Best Practices Checklist
Before (Planning and Setup)
- Define target platforms, minimum OS versions, and supported device matrix before writing code
- Choose architecture pattern based on team size and app complexity (see Section 1)
- Set up CI/CD with automated builds for every PR (Fastlane, GitHub Actions, Bitrise)
- Set up crash reporting (Crashlytics/Sentry) and analytics before the first beta build
- Establish a feature flag system before shipping the first feature (Firebase Remote Config at minimum)
- Define performance budgets: cold start < 1.5s, scroll 60fps, crash-free > 99.5%
- Set up certificate pinning configuration with backup pins
During (Development)
- Test process death on every screen (`adb shell am kill`, iOS background termination)
- Test on real devices, not just emulators, for performance-sensitive features
- Test with network conditions: slow 3G, airplane mode, WiFi-to-cellular transition
- Profile memory usage during long sessions (30+ minutes of use)
- Run LeakCanary (Android) / Instruments Leaks (iOS) before every release
- Use `DiffUtil`/`NSDiffableDataSourceSnapshot` for all list updates
- Decode images at display size, never at source resolution
- Move all I/O off the main thread (enforce with StrictMode on Android)
- Implement offline behavior for every data-dependent screen (at minimum: show cached data with “offline” indicator)
- Use idempotency keys for all mutating API calls
After (Release and Monitoring)
- Use staged rollout (1% → 5% → 20% → 100%) for every release
- Monitor crash-free rate within 2 hours of rollout start; halt if it drops below 99%
- Monitor ANR rate (Android): halt rollout if above 0.5%
- Track cold start time per release — alert if it regresses by more than 200ms
- Clean up stale feature flags quarterly
- Update minimum supported OS version annually (drop versions below 5% adoption)
- Audit app permissions annually — remove permissions you no longer use
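The rollout thresholds above can be encoded as an automated gate. This is a sketch under stated assumptions: the metric names and the `ReleaseMetrics` type are hypothetical, the values would come from your crash-reporting and performance tooling, and the thresholds are the ones listed in this checklist.

```python
from dataclasses import dataclass

STAGES = [1, 5, 20, 100]  # rollout percentages from the checklist


@dataclass
class ReleaseMetrics:
    crash_free_pct: float       # e.g. 99.6 means 99.6% crash-free
    anr_pct: float              # Android ANR rate, in percent
    cold_start_delta_ms: float  # change versus the previous release


def evaluate_rollout(current_stage: int, metrics: ReleaseMetrics):
    """Decide whether to halt, and which rollout percentage comes next."""
    reasons = []
    alerts = []
    if metrics.crash_free_pct < 99.0:
        reasons.append("crash-free rate below 99%")
    if metrics.anr_pct > 0.5:
        reasons.append("ANR rate above 0.5%")
    if metrics.cold_start_delta_ms > 200:
        # The checklist treats this as an alert, not an automatic halt.
        alerts.append("cold start regressed by more than 200ms")

    if reasons:
        next_stage = current_stage  # stay put until the problem is resolved
    else:
        later = [s for s in STAGES if s > current_stage]
        next_stage = later[0] if later else current_stage
    return {
        "halt": bool(reasons),
        "reasons": reasons,
        "alerts": alerts,
        "next_stage": next_stage,
    }
```

Running a check like this on a schedule during the first hours of a rollout turns the monitoring items into a decision someone does not have to remember to make.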
When Things Go Wrong
- Critical crash in production: Immediately check if the crash is behind a feature flag. If yes, disable the flag. If no, submit a hot-fix and request expedited App Store review.
- Certificate pinning lockout: If you pinned the wrong certificate and the app cannot connect, you need a new app version without the bad pin — and no way to distribute it through the app (because the app cannot connect to the server). Mitigation: always include a kill switch on an unpinned endpoint, or use short pin expiration with fallback to standard validation.
- Backend API breaking change: If the backend ships a breaking change and old mobile clients are affected, the mobile team cannot ship a fix faster than App Store review allows. Mitigation: the backend must support old API versions until the mobile team can ship and verify a fix.
- Data loss from sync conflict: Surface a recovery UI showing both versions. Never silently discard user data.
Above and Beyond
Advanced Techniques
- Baseline Profiles (Android). Ship a profile of hot code paths with the app. The Baseline Profile tells ART which methods to compile ahead of time at install, reducing interpretation and JIT compilation stutters on first launch. Google reports 15-30% startup time improvement. This is a low-effort, high-impact optimization that most teams overlook.
- Metal/Vulkan for custom rendering. For apps with heavy custom rendering (maps, data visualization, games), bypass the platform UI framework and render directly with Metal (iOS) or Vulkan (Android). This gives you full GPU control and can handle complex scenes that would overwhelm UIKit/Android Views.
- Shared Element Transitions. Animate a UI element (like a thumbnail) from one screen to another (the detail view) to create a fluid, connected navigation experience. Use MaterialSharedAxis and Shared Element transitions in Jetpack Navigation, or `matchedGeometryEffect` in SwiftUI.
- Predictive Back Gesture (Android 14+). The system shows a preview of the previous screen before the user commits to going back. Apps must adopt the new back API (`OnBackInvokedCallback`) to support this. A small change that significantly improves perceived performance.
- App Clips (iOS) / Instant Apps (Android). Lightweight versions of your app that users can use without installing. App Clips are < 15MB and launched from QR codes, NFC tags, or Safari Smart Banners. Instant Apps are loaded from the Play Store on demand. Both are excellent for acquisition flows (parking meters, restaurant ordering, event check-in).
Cross-Domain Connections
- Mobile + Edge Computing. Running ML models on-device (Core ML, TensorFlow Lite, MediaPipe) instead of server-side. Latency drops from 200ms (server round-trip) to 10ms (on-device). Privacy improves because data never leaves the device. Apple’s on-device processing for Siri, keyboard predictions, and photo face detection are examples.
- Mobile + Embedded Systems. Bluetooth LE communication with IoT devices (smart locks, health monitors, industrial sensors). The BLE stack on mobile is surprisingly complex — connection management, GATT service discovery, and MTU negotiation are all areas where mobile engineers need embedded-systems knowledge.
- Mobile + Accessibility. Accessibility is both a moral imperative and a legal requirement (ADA, WCAG). Senior mobile engineers treat accessibility as a first-class feature: semantic labels on every interactive element, dynamic type support, VoiceOver/TalkBack testing, sufficient color contrast. The accessibility APIs on iOS and Android are rich but under-utilized.
Emerging Trends
- Kotlin Multiplatform reaching maturity (2025-2026). With Google officially supporting KMP and Jetpack libraries adding KMP compatibility (Jetpack Room for KMP was announced in 2024), expect KMP to become the default for new projects that need cross-platform logic sharing.
- AI on-device. Apple Intelligence, Google Gemini Nano, and the broader push to run small language models on mobile. Core ML and ML Kit are evolving to support transformer models. The performance constraint is real — a 3B parameter model barely fits in 4GB of RAM.
- Spatial computing. Apple Vision Pro and the visionOS platform extend SwiftUI to 3D space. Even if spatial computing does not dominate consumer devices soon, the SwiftUI patterns for visionOS (windows, volumes, immersive spaces) will influence how we think about UI beyond flat screens.
- Privacy-first architecture. App Tracking Transparency (iOS 14+), Privacy Sandbox (Android), and increasing privacy regulation mean mobile apps must be designed for a world where device-level tracking is restricted. On-device attribution, differential privacy, and federated learning are replacing traditional analytics approaches.
Recommended Reading
Beginner
- Android Developers Guides — Google’s official documentation. Start with the architecture guide and the app lifecycle overview. Free, comprehensive, and kept up to date.
- Apple Human Interface Guidelines — Not just a design resource. Understanding Apple’s design philosophy helps you make architectural decisions (when to use a tab bar vs drawer, how to handle state restoration, when to use sheets).
- React Native New Architecture Guide — Essential reading if you are working with React Native. Explains JSI, Fabric, TurboModules, and Codegen with architectural diagrams.
Intermediate
- “Advanced iOS App Architecture” by raywenderlich.com (Kodeco) — Deep dive into MVVM, Clean Architecture, and coordinator patterns on iOS with production-quality code examples.
- Android Performance Patterns (YouTube series by Google) — Colt McAnlis’s series on memory, rendering, battery, and networking optimization. Each video is 5-10 minutes and packed with practical insights.
- “Offline First” by Greenrobot (Makers of ObjectBox/EventBus) — Pattern catalog for offline-first mobile architectures. Covers sync protocols, conflict resolution, and queue-based operations.
Advanced
- Martin Kleppmann’s “Designing Data-Intensive Applications” — Chapter 5 on replication and Chapter 9 on consistency and consensus directly apply to mobile sync and offline-first architecture. Not mobile-specific, but the concepts are foundational.
- The Signal Protocol Technical Documentation — If you want to understand end-to-end encryption implementation, this is the primary source. Covers X3DH key exchange, Double Ratchet, and Sender Keys.
- Shopify’s Mobile Engineering Blog — Real-world case studies from a team that adopted React Native at scale, including performance optimization, native module development, and their reasoning for cross-platform adoption.
Self-Assessment
Key Takeaways
- Mobile is a different engineering discipline, not “frontend for phones.” The constraints (battery, network, memory, app store gatekeeping) fundamentally change how you architect, test, and release software.
- MVVM is the right default architecture for most mobile apps. It is testable, lifecycle-aware, and has first-class framework support on both platforms. Use MVI for complex state, VIPER for very large teams, and MVC only for prototypes.
- The native vs cross-platform decision is a business decision, not a technical one. It depends on team composition, app complexity, platform API needs, and update velocity — not on which framework is “better.”
- Offline-first is an architecture, not a feature. If your app needs offline support, this decision must be made at the data layer from day one. Bolting it on later is painful.
- Push notifications are unreliable by design. Never use push as the sole delivery mechanism for critical information. Always have fallbacks.
- Process death is the most under-tested scenario in mobile. If you are not testing with `adb shell am kill` and iOS background termination, you have bugs you do not know about.
- Feature flags are your only rollback mechanism on mobile. You cannot un-ship a released app version. Feature flags let you disable broken features in minutes instead of waiting days for App Store review.