OPAQUE: An Asymmetric PAKE Protocol Secure Against Pre-Computation Attacks ? Stanislaw Jarecki1 , Hugo Krawczyk2 , and Jiayu Xu1 1 University of California, Irvine. Email: {stasio@ics.,jiayux@}uci.edu. 2 IBM Research. Email: hugo@ee.technion.ac.il. Abstract. Password-Authenticated Key Exchange (PAKE) protocols allow two parties that only share a password to establish a shared key in a way that is immune to offline attacks. Asymmetric PAKE (aPAKE) strengthens this notion for the more common client-server setting where the server stores a mapping of the password and security is required even upon server compromise; that is, the only allowed attack in this case is an (inevitable) offline exhaustive dictionary attack against individual user passwords. Unfortunately, current aPAKE protocols (that do not rely on PKI) allow for pre-computation attacks that lead to the instantaneous compromise of user passwords upon server compromise, thus forgoing much of the intended aPAKE security. Indeed, these protocols use – in essential ways – deterministic password mappings or use random “salt” transmitted in the clear from servers to users, and thus are vulnerable to pre-computation attacks. We initiate the study of Strong aPAKE protocols that are secure as aPAKE’s but are also secure against pre-computation attacks. We formalize this notion in the Universally Composable (UC) settings and present two modular constructions using an Oblivious PRF as a main tool. The first builds a Strong aPAKE from any aPAKE (which in turn can be constructed from any PAKE [26]) while the second builds a Strong aPAKE from any authenticated key-exchange protocol secure against KCI attacks. Using the latter transformation, we show a practical instantiation of a UC-secure Strong aPAKE in the Random Oracle model. The protocol (“OPAQUE”) consists of 3 messages, requires 3 and 4 exponentiations for server and client, respectively (including a multi-exponentiation and 1 or 2 fixed-base per party), provides forward secrecy and explicit mutual authentication, is PKI-free, supports user-side password hardening, has a built-in facility for password-based storage-and-retrieval of secrets and credentials, and accommodates a user-transparent server-side threshold implementation. 1 Introduction Passwords constitute the most ubiquitous form of authentication in the Internet, from the mundane to the most sensitive applications. The almost ? This is a revised ePrint version of the paper which appeared in Eurocrypt 2018 [33]. See revision notes in Sec. 1.2. universal password authentication method in practice relies on TLS/SSL and consists of the user sending its password to the server under the protection of a client-to-server confidential TLS channel. At the server, the password is decrypted and verified against a one-way image typically computed via hash iterations applied to the password and a random “salt” value. Both the password image and salt are stored for each user in a so-called “password file.” In this way, an attacker who succeeds in stealing the password file is forced to run an exhaustive offline dictionary attack to find users’ passwords given a set (“dictionary”) of candidate passwords. The two obvious disadvantages of this approach are: (i) the password appears in cleartext at the server during login1 ; and (ii) security breaks if the TLS channel is established with a compromised server’s public key (a widespread concern given today’s too-common PKI failures2 ). Password protocols have been extensively studied in the cryptographic literature – including in the above client-server setting where the user is assumed to possess an authentic copy of the server’s public key [27, 29], but the main focus has been on password-only protocols where the user does not need to rely on any outside keying material (such as public keys). The basic setting considers two parties that share the same low-entropy password with the goal of establishing shared session keys secure against offline dictionary attacks, namely, against an active attacker that possesses a small dictionary from which the password has been chosen. The only viable option for the attacker should be the inevitable online impersonation attack with guessed passwords. Such model, known as password-authenticated key exchange (PAKE), was first studied by Bellovin and Merritt [7] and later formalized by Bellare et al. [6] in the game-based indistinguishability approach. Canetti et al. [15] formalized PAKE in the Universally Composable (UC) framework [14], which better captures PAKE security issues such as the use of arbitrary password distributions, the inputting of wrong passwords by the user, and the common practice of using related passwords for different services. Whereas the cryptographic literature on PAKE’s focuses on the above basic setting, in practice the much more common application of password protocols is in the client-server setting. However, sharing the same password between user and server would mean that a break to the server leaks plaintext passwords for all its users. Thus, what’s needed is that upon a server compromise, and the stealing of the password file, an attacker is forced to perform an exhaustive offline dictionary attack as in the above TLS scenario. No other attack, except for an inevitable online guessing attack, should be feasible. In particular, the 1 See [2, 3] for examples of the detrimental effect of such exposure even for servers that are not under malicious attack. 2 PKI failures include stealing of server private keys, software that does not verify certificates correctly, users that accept invalid or suspicious certificates, certificates issued by rogue CAs, servers that share their TLS keys with others – e.g., CDN providers or security monitoring software, information (including passwords) that traverses networks in plaintext form after TLS termination; and more. 2 two main shortcomings of password-over-TLS mentioned earlier - reliance on public keys and exposure of the password to the server - need to be eliminated. This setting, known as aPAKE, for asymmetric PAKE (also called augmented or verifier-based), was introduced by Bellovin and Merrit [8], later formalized in the simulation-based approach by Boyko et al. [13], and in the UC framework by Gentry et al. [26]. Early protocols proven in the simulation-based model include [13,42,43]. Later, Gentry et al. [26] presented a compiler that transforms any UC- PAKE protocol into a UC-aPAKE (adding an extra round of communication and a client’s signature). This was followed by [34] who show the first simultaneous one-round adaptive UC-aPAKE protocol. In addition, several aPAKE protocols targeting practicality have been proposed, most with ad-hoc security arguments, and some have been (and are being) considered for standardization (see below). A common deficiency of all these aPAKE protocols, including those being proposed for practical use, is that they are all vulnerable to pre-computation attacks. Namely, the attacker A can pre-compute a table of values based on a passwords dictionary D, so as soon as A succeeds in compromising a server it can instantly find a user’s password. This weakens the benefits of security against server compromise that motivate the aPAKE notion in the first place. Moreover, while current definitions require that the attacker cannot exploit a server compromise without incurring a workload proportional to the dictionary size |D|, these definitions allow all this workload to be spent before the actual server compromise happens. Indeed, this weakening in the existing aPAKE security definition [26] is needed to accommodate aPAKE protocols that store a one-way deterministic mapping of the user’s password at the server, say H(pw). Such protocols trivially fall to a pre-computation attack as the attacker A can build a table of (H(pw), pw) pairs for all pw ∈ D, and once it compromises the server, it finds the value H(pw) associated with a user and immediately, in log(|D|) time, finds that user’s password. Such devastating attack can be mitigated by “personalizing” the password map, e.g., hashing the password together with the user id. This forces A to pre-compute separate tables for individual users, yet all this effort can still be spent before the actual server compromise. Note that the standard password-over-TLS scheme prevents pre-computation by hashing passwords with a random salt visible to the server only. In contrast, existing aPAKE protocols that do not rely on PKI, either don’t use salt or if they do, the salt is transmitted from server to user during login in the clear3 . Given that password stealing via server compromise is the main avenue for collecting billions of passwords by attackers, the above vulnerability of existing aPAKE protocols to pre-computation attacks is a serious flaw (particularly applicable to targeted attacks), and in this aspect password-over-TLS is more secure than all known aPAKE schemes. 3 Note that even if the aPAKE protocol runs over TLS, the transmitted salt is open to a straightforward active attack. 3 1.1 Our contributions We initiate the study of Strong aPAKE (SaPAKE) protocols that strengthen the aPAKE security notion by disallowing pre-computation attacks. We formalize this notion in the Universally Composable (UC) model by modifying the aPAKE functionality from [26] to eliminate an adversarial action which allowed such pre- computation attacks. As we explain above, allowing pre-computation attacks was indeed necessary to model the security of existing aPAKE protocols. The next contribution is building Strong aPAKE (SaPAKE) protocols. For this we present two generic constructions. The first (Section 4) builds the SaPAKE protocol from any aPAKE protocol (namely one that satisfies the original definition from [26]) so that one can “salvage” existing aPAKE protocols. To do so we resort to Oblivious PRF (OPRF) functions [25, 31], namely, a PRF with an associated two-party protocol run between a server S that stores a PRF key k and a user U with a password pw. At the end of the interaction, U learns the PRF output Fk (pw) and S learns nothing (in particular, nothing about pw). We show that by preceding any aPAKE protocol with an OPRF interaction in which U computes the value rw = Fk (pw) with the help of S and uses rw as the password in the aPAKE protocol, one obtains a Strong aPAKE protocol. We show that if the OPRF and the given aPAKE protocol are, respectively, UC realizations of the OPRF functionality we present (based on [31]) and the original aPAKE functionality from [26], the resultant scheme realizes our UC functionality FsaPAKE . Our second transformation (Section 5) consists of the composition of an OPRF with a regular authenticated key exchange protocol AKE. We require UC security (with forward secrecy) for the AKE protocol as well as resistance to KCI attacks. The latter is a common feature of public-key AKE protocols that ensures that an attacker that learns the secret keys of one party P, but does not actively control P, cannot use this information to impersonate another party P0 to P. In our OPRF-AKE composition, U first runs the OPRF with S to compute rw = Fk (pw); then it runs the AKE protocol with S using a private key stored, encrypted under rw, at S who sends it to U. Crucial to the security of the protocol is a “random-key robustness” property of the encryption function that can be ensured, for example, by adding an HMAC to a symmetric encryption scheme. Additionally, we assume the encryption to be formally “equivocable” which requires random-oracle, or ideal cipher, modeling of the encryption function (these requirements can be relaxed by a variant of OPAQUE that dispenses with the encryption of the user’s private key). We show that the aPAKE scheme resultant from the above composition realizes our UC functionality FsaPAKE . We use the above second transformation to instantiate a Strong aPAKE protocol with a very efficient OPRF and any efficient UC-secure AKE with the KCI property. The OPRF scheme we use, essentially a Chaum-type blinded DH computation, has been proven UC-secure by Jarecki et al. [30, 31]. We show that this OPRF scheme, which we call DH-OPRF(called 2HashDH in [30, 31]), remains secure in spite of changes to the OPRF functionality that 4 we introduce for supporting a stronger OPRF notion needed in our setting. We call the result of this instantiation, the OPAQUE protocol. OPAQUE combines the best properties of existing aPAKE protocols and of the standard password-over-TLS. On the one hand, it greatly improves on TLS by not relying on PKI and never exposing the cleartext password to the server. At the same time, it resolves the major flaw of existing aPAKE protocols relative to password-over-TLS, namely, their vulnerability to pre-computation attacks. In addition, OPAQUE offers significant features for practical deployment. Its modularity allows for its use with different key-exchange schemes that can provide different features and performance tradeoffs. When implemented with the DH-OPRF scheme, its cost is one exponentiation for the server and two for the client4 in addition to the KE protocol cost which can be as little as 2.17 exponentiations per party using HMQV [38] (with full forward secrecy and mutual explicit authentication). The OPRF messages are piggy-backed on those of the AKE protocol for a total of three messages. Another feature of OPAQUE is that it allows for client-side hardening (via hashing or memory-hard functions) that increases the cost of offline dictionary attacks upon server compromise as well as the cost of online guessing attacks. In Fig. 12 in Section 6 we show an instantiation of OPAQUE in the RO model with HMQV as the AKE. Compared to the practical aPAKE protocols that have been and are being considered for standardization (cf., [1, 48]), OPAQUE fares clearly better on the security side as the only scheme that offers resistance to pre-computation attacks (other aPAKE protocols can be made “strong” by using our methodology from Section 4). Performance-wise, OPAQUE is competitive with the more efficient among these protocols (see Section 6). OPAQUE also provides a unique functionality among aPAKE protocols in that it allows to store and retrieve user’s secrets such as a bitcoin wallet, authentication credentials, encrypted backup keys, etc., thus offering a far more secure alternative to the practice of deriving low-entropy secrets directly from a user’s password. Furthermore, OPAQUE allows for a user-transparent server-side threshold implementation [32] where the only exposure of the user password - or any stored secrets - is in case a threshold of servers is compromised and even then a full dictionary attack is required. Finally, we comment that while OPAQUE can completely replace password authentication in TLS, it can also be used in conjunction with TLS for protecting account information, for bootstrapping TLS client authentication (via an OPAQUE-retrieved client signing key), or as an hedge against PKI failures. In other words, while we are accustomed to use TLS to protect passwords, OPAQUE can be used to protect TLS. We expand on this aspect in Section 6.2. Prior protocol variants. OPAQUE is not a completely new protocol. Variants have been studied in prior work in several settings but none of these works presents a formal analysis of the protocol as an aPAKE, let alone as a Strong 4 A variant of the protocol discussed in Section 6.3 allows one or both of the client’s exponentiations to be fixed-base and offline. 5 aPAKE, a notion that we introduce here for the first time. While our treatment frames OPAQUE in the context of Oblivious PRFs [30, 31], its design can be seen as an instantiation of the Ford-Kaliski paradigm for password hardening and credential retrieval using Chaum’s blinded exponentiation. Most related is Boyen’s work [12] which specifies and studies an instantiation of the protocol (called HPAKE) in the setting of client-side halting KDF [11]. Jarecki et al. [30, 31] study a threshold version (also using the OPRF abstraction) in the context of password-protected secret sharing (PPSS) protocols. Because of the relation between PPSS and Threshold PAKE protocols [30], this analysis implies security of OPAQUE as a PAKE protocol in the BPR model [6] but not as an aPAKE (let alone as a strong aPAKE). 1.2 Revision Notes The current version of this paper introduces substantial revisions in the formal treatment relative to the proceedings version [33]. The protocol specification has not changed except that the necessity for forward secrecy made explicit here implies that OPAQUE cannot be implemented with a 2-message protocol as depicted in [33] (see footnote 13 in page 48). We list the main formal changes here with details provided in the corresponding sections. 1. We strengthen (Section 5.1) the AKE functionality with adaptive security for both client and server (in [33] we only did it on the server side) which, in particular, ensures forward secrecy as needed for the security of the OPAQUE protocol. 2. We relaxed the Strong aPAKE functionality in three ways (Section 2). First, in the event that the attacker guesses the password correctly, we consider sessions that were created but not completed before the guessing event as compromised, but only if the attacker actively interfered in these sessions. Second, the attacker is allowed one password guess after a session is completed (if it guesses correctly, it learns that the guess is correct but it does not learn the key output by the completed session). Third, the attacker can test if a user’s session runs on a password matching the server’s even if the server does not run a sub-session whose ID matches the tested user’s sub-session. While these relaxations are necessary for proving security, they do not seem to earn the attacker any practical advantages. 3. The composition of OPRF and AKE functionalities that form the OPAQUE scheme require a binding mechanism between the OPRF and AKE sub- sessions which we implement by including in the AKE’s session identifier a prefix of the OPRF session transcript (e.g., for DH-OPRF function, this prefix is defined as the user’s initial value a). In order to formalize this mechanism we extend the OPRF functionality in Section 3 to output a prefix of the function’s transcript. 4. We identify the requirement for the authenticated encryption protecting the user’s private key stored at S to be equivocable (Section 5.2). 6 5. Pending revision: The material in Section 4 (compiler from aPAKE to strong aPAKE) will be revised shortly to adapt the proof of Theorem 1 to the modified OPRF and SaPAKE functionalities. 2 The Strong aPAKE Functionality In this section we present the ideal Strong aPAKE UC functionality, FsaPAKE , shown in Fig. 2, that will serve as our definition of Strong aPAKE security; namely, we will call a protocol a secure Strong aPAKE if it realizes FsaPAKE . Functionality FsaPAKE is a relaxation of the Strong aPAKE notion defined in the proceedings version of this paper [33] to which we refer as FsaPAKE+ .5 Functionality FsaPAKE and its predecessor FsaPAKE+ build on the UC asymmetric PAKE (aPAKE) functionality FaPAKE defined by Gentry et al. [26]6 which, in turn, adapts the (symmetric) UC PAKE defined by Canetti et al. [15] to the asymmetric case. We present FsaPAKE in three stages: First, we explain how the UC functionality FaPAKE used to define asymmetric PAKE security in [26] builds on the symmetric UC PAKE notion of [15]. Then we show how functionality FsaPAKE+ strengthens the definition from [26] to eliminate the vulnerability to pre-computation attacks. Lastly, we derive our UC Strong aPAKE functionality FsaPAKE by modifying FsaPAKE+ to account for adversarial behavior which is possible against our main protocol OPAQUE from Section 5 but, as we argue, does not compromise practical security in any significant way. Asymmetric PAKE. The aPAKE functionality of [26], denoted here FaPAKE and shown in Fig. 1 (full text), extends the symmetric UC PAKE functionality from [15] to the asymmetric (user-server) setting. First, in an aPAKE scheme the server and the user run different programs: The user runs an aPAKE session on a password (via command UsrSession) while the server runs it on a “password file” file[sid] that represents server’s user-specific state corresponding to the user’s password, e.g., a password hash, which the server creates on input the user’s password via command StorePwdFile during aPAKE initialization. Furthermore, FaPAKE models a possible compromise of a server, via command StealPwdFile, from which the attacker obtains file[sid]. Such compromise subsequently allows the attacker to (1) impersonate the server to the user, via command Impersonate, and (2) find the password via an offline dictionary attack, modeled by command OfflineTestPwd. The way functionality FaPAKE of [26] handles the offline dictionary attack is the focus of the Strong aPAKE functionality we define, and which we discuss next. Strong aPAKE vs. aPAKE. As discussed in the introduction, the aPAKE functionality from [26] is too weak to ensure resistance to offline dictionary attacks. Protocols proven secure in that model may allow an attacker to build 5 Functionality FsaPAKE+ was denoted FsaPAKE in [33], but we argue that it is stronger than necessary, hence we denote it here with the plus sign. 6 Functionality FaPAKE was denoted FapwKE in [26]. 7 In the description below, we assume P ∈ {U, S}. Password Registration – On (StorePwdFile, sid , U, pw) from S, if this is the first StorePwdFile message, record hfile, U, S, pwi and mark it uncompromised. Stealing Password Data – On (StealPwdFile, sid ) from A∗ , if there is no record hfile, U, S, pwi, return “no password file” to A∗ . Otherwise, if the record is marked uncompromised, mark it compromised; regardless, • If there is a record hoffline, pwi, send pw to A∗ . • Else Return “password file stolen” to A∗ . – On (OfflineTestPwd, sid , pw∗ ) from A∗ , do: • If there is a record hfile, U, S, pwi marked compromised, do: if pw∗ = pw, return “correct guess” to A∗ ; else return “wrong guess.” • Else record hoffline, pwi. Password Authentication – On (UsrSession, sid , ssid , S, pw0 ) from U, send (UsrSession, sid , ssid , U, S) to A∗ . Also, if this is the first UsrSession message for ssid , record hssid , U, S, pw0 i and mark it fresh. – On (SvrSession, sid , ssid ) from S, retrieve hfile, U, S, pwi, and send (SvrSession, sid , ssid , U, S) to A∗ . Also, if this is the first SvrSession message for ssid , record hssid , S, U, pwi and mark it fresh. Active Session Attacks – On (TestPwd, sid , ssid , P, pw∗ ) from A∗ , if there is a record hssid , P, P0 , pw0 i marked fresh, do: if pw∗ = pw0 , mark it compromised and return “correct guess” to A∗ ; else mark it interrupted and return “wrong guess.” – On (Impersonate, sid , ssid ) from A∗ , if there is a record hssid , U, S, pw0 i marked fresh, do: if there is a record hfile, U, S, pwi marked compromised and pw0 = pw, mark hssid , U, S, pw0 i compromised and return “correct guess” to A∗ ; else mark it interrupted and return “wrong guess.” Key Generation and Authentication – On (NewKey, sid , ssid , P, SK ∗ ) from A∗ where |SK ∗ | = `, if there is a record hssid , P, P0 , pw0 i not marked completed, do: • If the record is compromised, or P or P0 is corrupted, set SK := SK ∗ . • Else if the record is fresh, a (sid , ssid , SK 0 ) tuple was sent to P0 , and at that time there was a record (ssid , P0 , P) marked fresh, set SK := SK 0 . • Else pick SK ←R {0, 1}` . Finally, mark hssid , P, P0 , pw0 i completed and send (sid , ssid , SK) to P. – On (TestAbort, sid , ssid , P) from A∗ , if there is a record hssid , P, P0 , pw0 i not marked completed, do: • If it is fresh and there is a record hssid , P0 , P, pw0 i, send Succ to A∗ . • Else send Fail to A∗ and (abort, sid , ssid ) to P, and mark hssid , P, P0 , pw0 i completed. Fig. 1: Functionalities FaPAKE (full text) and FsaPAKE+ ( shadowed text omitted) 8 Password Registration – On (StorePwdFile, sid , U, pw) from S, if this is the first such message, record hfile, U, S, pwi, mark it uncompromised, set flag := uncompromised. Stealing Password Data – On (StealPwdFile, sid ) from A∗ , if there is no record hfile, U, S, pwi, return “no password file” to A∗ . Otherwise, if the record is marked uncompromised, mark it compromised; regardless, return “password file stolen” to A∗ . – On (OfflineTestPwd, sid , pw∗ ) from A∗ , if there is a record hfile, U, S, pwi marked compromised, do: if pw∗ = pw, return “correct guess” to A∗ and set flag := compromised ; otherwise return “wrong guess.” Password Authentication – On (UsrSession, sid , ssid , S, pw0 ) from U, send (UsrSession, sid , ssid , U, S) to A∗ . Also, if this is the first UsrSession message for ssid , record hssid , U, S, pw0 i and mark it fresh. – On (SvrSession, sid , ssid ) from S, retrieve hfile, U, S, pwi, and send (SvrSession, sid , ssid , U, S) to A∗ . Also, if this is the first SvrSession message for ssid , record hssid , S, U, pwi and mark it fresh. Active Session Attacks – On (Interrupt, sid , ssid , S) from A∗ , if there is a record hssid , S, U, pwi marked fresh, mark it interrupted and set dPT(ssid ) := 1. – On (TestPwd, sid , ssid , P, pw∗ ) from A∗ , retrieve record hssid , P, P0 , pw0 i and: • If the record is fresh then do the following: If pw∗ = pw0 return “correct guess” to A∗ and mark hssid , P, P0 , pw0 i compromised, otherwise return “wrong guess” and mark hssid , P, P0 , pw0 i interrupted. • If P = S and dPT(ssid ) = 1 then set dPT(ssid ) := 0 and if pw∗ = pw0 then return “correct guess” to A∗ else return “wrong guess.” In either case, if P = S and pw∗ = pw0 then set flag := compromised. – On (Impersonate, sid , ssid ) from A∗ , if there is a record hssid , U, S, pw0 i marked fresh, do: If there is a record hfile, U, S, pwi marked compromised and pw0 = pw, mark hssid , U, S, pw0 i compromised and return “correct guess” to A∗ ; otherwise mark it interrupted and return “wrong guess.” Key Generation and Authentication – On (NewKey, sid , ssid , P, SK ∗ ) from A∗ where |SK ∗ | = `, if there is a record hssid , P, P0 , pw0 i not marked completed, do: • If the record is compromised, or (P = S, the record is interrupted and flag = compromised) , or either P or P0 is corrupted, set SK := SK ∗ . • Else, if the record is fresh and (sid , ssid , SK 0 ) was sent to P0 at the time there was a record hssid , P0 , P, pw0 i marked fresh, set SK := SK 0 . • Else pick SK ←R {0, 1}` . Finally, mark hssid , P, P0 , pw0 i completed and send (sid , ssid , SK) to P. – On (TestAbort, sid , ssid , P) from A∗ , if there is a record hssid , P, P0 , pw0 i not marked completed, do: • If it is fresh and there is a record hssid , P0 , P, pw0 i send Succ to A∗ . • If it is fresh, P0 = S, and pw0 = pw, send Succ to A∗ . • Otherwise send Fail to A∗ and (abort, sid , ssid ) to P, and mark hssid , P, P0 , pw0 i completed. Fig. 2: Functionality FsaPAKE with marked additions relative to FsaPAKE+ 9 a dictionary prior to compromising a server so that when the compromise occurs it immediately finds the user’s password (moreover, all previous protocols proven under this functionality were indeed subject to such pre-computation attack). In contrast, our UC Strong aPAKE functionality FsaPAKE disallows such attacks. To explain how, we first consider functionality FsaPAKE+ that serves as a nexus between FaPAKE and FsaPAKE . Functionality FsaPAKE+ , presented in Fig. 1 (gray text omitted), was used to define Strong aPAKE in the proceedings version of this paper [33] but it turns out to be too restrictive as we explain below. As depicted in the figure, the only formal difference between functionalities FaPAKE and FsaPAKE+ are in the actions upon the stealing of the password file. Specifically, FsaPAKE+ omits recording the hoffline, pwi pairs and does not allow for OfflineTestPwd queries made before the StealPwdFile query. Let us explain. In FsaPAKE+ , the actions upon server compromise, i.e., StealPwdFile, are simple. First, a flag is defined to mark that the password file has been compromised. Second, once this event happens, the adversary is allowed to submit password guesses and be informed if a guess was correct. Note that each guess “costs” the attacker one OfflineTestPwd query. This together with the restriction that these queries can only be made after the password file is compromised ensure that shortcuts in finding the password after such compromise are not possible, namely that the attacker needs to pay with one OfflineTestPwd query for each password it wants to test. Thus, pre-computation attacks are made infeasible. Now, consider the FaPAKE functionality [26] which includes the text in gray too. This functionality allows the attacker, via hoffline, pwi records, to make guess queries against the password even before the password file is compromised. The restriction is that the responses to whether a guess was correct or not are provided to the attacker only after a StealPwdFile event. But note that if one of these guesses was correct, the attacker learns it immediately upon server compromise. This provision was necessary in [26] to prove their aPAKE protocol that included in the file[sid] a deterministic publicly-computable hash of the password, thus allowing for a pre-computation attack which lets the adversary instantaneously identify the password with a single table lookup upon server compromise. Indeed, one can think of the pairs hoffline, pwi in the original FaPAKE functionality as a pre-computed table that the attacker builds overtime and which it can use to identify the password as soon as the server is compromised. By eliminating the ability to get guesses hoffline, pwi answered before server compromise in our FsaPAKE+ functionality, we make such pre-computation attacks infeasible in the case of a Strong aPAKE. Modeling Server Compromise and Offline Dictionary Queries. In FsaPAKE+ as in FaPAKE , StealPwdFile and OfflineTestPwd messages from A∗ to FsaPAKE+ are accounted for by the environment. This is consistent with the UC treatment of adaptive corruption queries and is crucial to our modeling. Note that if the environment does not observe adaptive corruption queries then the ideal model adversary, i.e., the simulator, could immediately 10 corrupt all parties at the beginning of the protocol, learning their private inputs and thus making the work of simulation straightforward. By making the player-corruption queries, modeled by StealPwdFile command in our context, observable by the environment, we ensure that the environment’s view of both the ideal and the real execution includes the same player-corruption events. This way we keep the simulator “honest,” because it can only corrupt a party if the environment accounts for it. The same concern pertains to offline dictionary queries OfflineTestPwd, because if they were not observable by the environment, the ideal adversary could make such queries even if the real adversary does not. In particular, without environmental accounting for these queries the FaPAKE and FsaPAKE+ functionalities would be equivalent because the simulator could internally gather all the offline dictionary attack queries made by the real-world adversary before server corruption, and it would send them all via the OfflineTestPwd query to FsaPAKE+ after server corruption via the StealPwdFile query. Such simulator would make the ideal-world view indistinguishable from the real-world view to the environment if the environment does not observe the sequence of OfflineTestPwd and StealPwdFile queries. Finally, we note that FsaPAKE+ (following FaPAKE ) has effectively two separate notions of a server corruption. Formally, it considers a static adversarial model where all entities, including users and servers, are either honest or corrupt throughout the life-time of the scheme. However, it adds a crucial adaptive capability to the attacker against honest servers via the StealPwdFile action. It results in leaking to the adversary the server’s private state corresponding to a particular password file, but it does not give the adversary full control over the server’s entity. In particular, the accounts on the same server for which the adversary does not explicitly issue the StealPwdFile command must remain unaffected. We adopt this convention from [26] and we call a server “corrupted” if it is (statically) corrupt and adversarially controlled, and we call an aPAKE instance “compromised” if the adversary steals its password file from the server. Functionality FsaPAKE : Functionality FsaPAKE+ captures the core notion of security against pre-computation attacks for asymmetric PAKE protocols. It also satisfies the essential property of PAKE protocols (asymmetric and symmetric) that an attacker can only test one password guess per PAKE session. However, this functionality is stronger than necessary in three ways. First, it requires that an attacker who guesses a password cannot compromise any existing session, not even those where the adversary actively tried to impersonate a user, namely, incomplete interrupted sessions. This choice was made by all prior UC PAKE functionality variants, symmetric [15] and asymmetric [26], but it is an over-restrictive condition relative to definitions of secure key-exchange protocols that do not guarantee the security of incomplete sessions with active corruptions (e.g., if a party to a session is corrupted while the session is open, no security is guaranteed for that session). 11 Secondly, functionality FsaPAKE+ (as previous UC PAKE functionalities) requires that a password tested in an active attack is efficiently extractable (from adversary’s computation) before the attacked session completes. This requirement cannot be satisfied by an aPAKE protocol where the user’s messages are committing to a single tested password, but this password is extractable only from the user’s computation after the protocol completes on the server side. In this case the attacker cannot influence the computation of the session key but it can still learn if its password guess is correct. Third, functionality FsaPAKE+ , like FaPAKE [26], lets a man-in-the-middle adversary observe whether some U and S sessions share the same password, by “connecting” these two sessions, i.e. routing the messages back and forth between them, and observing if either U or S aborts (which implies pw0 6= pw) or not (which might imply pw0 = pw). This is indeed a necessary information leakage in any (sa)PAKE protocol where either party implements explicit entity authentication, and aborts if the peer fails to authenticate. In UC aPAKE functionality FaPAKE of [26] this password-matching test was modeled with interface TestAbort which on input (sid , ssid , P) replied Succ if and only if, for the tested session hssid , P, P0 , pwi there is a record of its counterparty P0 running a session ssid 0 , P0 , P, pw0 s.t. ssid 0 = ssid and pw0 = pw. While this sub-session-specific password-equality verification makes sense in the case of U sessions, whose each sub-sessions might run on a different password, it seems an overkill in the case of S sub-sessions, which all run on the same password file, a one-way function of a fixed password pw. In our context, either requirement prevents proving security of the protocols obtained via our general compiler from Section 5, including the OPAQUE protocol from Section 6. For this reason we relax FsaPAKE+ to obtain our definition of UC Strong aPAKE functionality FsaPAKE , presented in Fig. 2. Changes with respect to FsaPAKE+ are highlighted with shadowed background. FsaPAKE preserves the essential guarantees of FsaPAKE+ , i.e. the same level of security against server compromise, the ability of the attacker to try at most one password guess per session, and the security of all sessions that complete while the user’s password is uncompromised. However, FsaPAKE relaxes the FsaPAKE+ in three ways: First, it does not guarantee the security of sessions in which the attacker actively interferes and which are not completed at the time the attacker learns the password. Second, it allows for passwords tested by the attacker in a given session to be extractable only after the session completes. Third, it allows TestAbort to test if any U sub-session runs on the server’s password without S running a sub-session with a matching ID. The first relaxation is modeled by adding a flag, denoted flag, which indicates whether the user’s password is guessed by the attacker via a TestPwd query against some session of server S. When that occurs flag is set to compromised and every actively attacked session of S, i.e. one whose record is set to either interrupted or compromised, that completes when flag = compromised, is treated as compromised and the attacker can set its session key. 12 The second relaxation is modeled by adding a new query Interrupt, which models an active attack against S session in which the adversary does not contribute the password guess pw∗ straight away. Session (S, ssid ) attacked in this way is marked with flag dPT(ssid ) set to 1, which allows the adversary to make a delayed password test on this session. Namely, if dPT(ssid ) = 1 then an adversary can make one TestPwd query against session (S, ssid ) regardless of its status, i.e. either before this session completes via NewKey or after. In the first case, i.e. if TestPwd is sent before the session completes and the password guess pw∗ is correct then the adversary learns that pw∗ is correct, flag is set to compromised, and all interrupted and still active S sessions, including (S, ssid ), are treated as compromised so the attacker can set their session keys when they complete. In the second case, i.e. if TestPwd is sent after the session completes, then the only thing the adversary gets is the information whether the password guess pw∗ was correct or not, but it cannot either influence or learn the session key output by (S, ssid). Either way the TestPwd query sets flag dPT(ssid ) to 0, and since the session is then either interrupted or completed, no new password query can be made against it. Note that the only new attack this mechanism allows for is that an active attacker can make a “postponed” TestPwd query against this session, when the session already completes, rather than making it before. However, the password guess pw∗ is committed at the time of the active attack, because only one pw∗ can be tested against the session; and since making the TestPwd query after the session completes gives less power to the attacker than making it before the session completes, and the latter is allowed by the standard (a)PAKE model, this new attack avenue seems to offer no advantages to the attacker. The third relaxation is modeled by allowing TestAbort on U’s sub-session to reply Succ if U’s input pw0 matches S’s password pw, and the test proceeds whether or not S runs a sub-session with a matching ssid . Non-black-box Assumptions. Note that the aPAKE functionality requires the simulator, playing the role of the ideal-model adversary, to detect offline password guesses made by the real-world adversary. As pointed out by [26], this seems to require a non-black-box hardness assumption on some cryptographic primitive, e.g., the Random Oracle Model (ROM), which would allow the simulator to extract a password guess from adversary’s local computation, e.g., a local execution of aPAKE interaction on a password guess and a stolen password file. Server Initialization. We note that while FaPAKE defines password registration as an internal action of server S, with the user’s password as a local input, one can modify it to support an interactive procedure between user and server, e.g., to prevent S from ever learning the plaintext password. Such interactive initialization, would require a change in the FsaPAKE functionality to allow the user to build the password file and send it to the server. For simplicity, and in order to minimize changes relative to [26], we keep the non-interactive functionality. See more about interactive initialization in Section 6.1. 13 Public Parameters: PRF output-length `, polynomial in security parameter τ . Conventions: For every i, x, value Fsid,i (x) is initially undefined, and if undefined value Fsid,i (x) is referenced then FOPRF assigns Fsid,i (x) ←R {0, 1}` . Initialization On message (Init, sid ) from party S, if this is the first Init message for sid , set tx = 0 and send (Init, sid , S) to A∗ . From now on use tag “S” to denote the unique entity which sent the Init message for the session identifier sid . (Ignore all subsequent Init messages for sid .) Server Compromise On (Compromise, sid , S) from A∗ , declare server S as compromised. (If S is corrupted then it is declared compromised from the beginning.) Note: Message (Compromise, sid , S) requires permission from the environment. Offline Evaluation On (OfflineEval, sid , i, x) from P ∈ {S, A∗ }, send (OfflineEval, sid , Fsid,i (x)) to P if any of the following hold: (i) S is corrupted, (ii) P = S and i = S, (iii) P = A∗ and i 6= S, (iv) P = A∗ and S is compromised. Online Evaluation – On (Eval, sid , ssid , S0 , x) from P ∈ {U, A∗ }, send (Eval, sid , ssid , P, S0 ) to A∗ . On prfx from A∗ , ignore this message if prfx was used before. Else record hssid , P, x, prfxi and send (Prefix, sid , ssid , prfx) to P. – On (SndrComplete, sid , ssid 0 ) from S, send (SndrComplete, sid , ssid 0 , S) to A∗ . On prfx0 from A∗ , send (Prefix, sid , ssid 0 , prfx0 ) to S. If there is a record hssid , P, x, prfxi for P 6= A∗ and prfx = prfx0 , change it to hssid , P, x, OKi, else set tx++. – On (RcvComplete, sid , ssid , P, i) from A∗ , ignore this message if there is no record hssid , P, x, prfxi or if (i = S, tx = 0, and prfx 6= OK). Else send (Eval, sid , ssid , Fsid,i (x)) to P, and if (i = S and prfx 6= OK) then set tx−−. Fig. 3: Functionality FOPRF with Adaptive Compromise 3 Oblivious Pseudorandom Function An Oblivious Pseudorandom Function scheme (OPRF) is a central tool in all our constructions. An OPRF consists of a pseudorandom function family F with an associated two-party protocol executed between a server that holds a key k for F and a user with an input x. At the end of the interaction, the user learns the PRF output Fk (x) and nothing else, and the server learns nothing (in particular, nothing about x). The notion of OPRF was introduced in [25]. The first UC formulation of it was given in [30], including a verifiability property that lets the user check the correct behavior of the server during the OPRF execution. Later [31] gave an alternative UC definition of OPRF which dispensed with the verifiability property, allowing for more efficient instantiations. The main idea in the OPRF formulations of [30, 31] is the use of a ticketing mechanism that ensures that the number of input values on which anyone can compute the OPRF 14 on a key held by an honest server S is no more than the number of executions of the OPRF recorded by S. This mechanism dispenses with the need to extract users’ inputs as is typically needed in UC simulations and it allows much more efficient OPRF instantiations. Here we adopt the UC OPRF formulation from [31] as the basis for our definition of adaptively secure OPRF functionality FOPRF , presented in Fig. 3. We refer to [31, 33] for detailed rationale for this functionality, but we note that it requires PRF outputs to be pseudorandom even to the owner of the PRF key k. This does not seem achievable under non-black-box assumptions, but it is achievable, indeed very efficiently, in the Random Oracle Model (ROM). (In Appendix B we show that the DH-OPRF scheme of [31] realizes FOPRF in ROM under the same One More Diffie-Hellman assumption which was used to show that the same protocol realizes the UC OPRF formalization of [31].) Note that the reliance on non-black-box assumptions like ROM is called for in the aPAKE UC context, see Section 2. The UC OPRF definition shown in Fig. 3 is a revision of the definition given in [33]. In Appendix A we include detailed explanations of the differences between the UC OPRF of Fig. 3 and the UC OPRF defined in proceedings version of this paper [33], as well as the differences between both of the above definitions and the UC OPRF definition given in [31]. OPRF Transcript Prefixes. As we explain in Appendix A almost all changes between UC OPRF formulation in Fig. 3 and the corresponding formulation in the proceedings version of this paper [33] are syntactic. However, there is one modification which is more than syntactic, which is that here we extend the UC OPRF functionality so that both user and server sessions have additional local output, which is a prefix of the OPRF protocol transcript. The protocol transcript is of course application dependent, and how much of it counts as a prefix can depend on the implementation, but the functionality FOPRF does not care what these prefixes are except for the following constraint: If some subsession of server S shares a protocol prefix with some subsession of user U, then the only party which can compute function Fk (·) on some input x due to this interaction is that U’s subsession, and not e.g. the adversary. This property is not overly restrictive and indeed it is expected in any implementation of OPRF. It means simply that if the man-in-the-middle adversary forwards the messages exchanged between some U and S subsessions until some point (that point defined how much of the OPRF transcript counts as its prefix), then the adversary can no longer stage an active attack on S, and use it to compute Fk (x) for x of its choice. The only choices the adversary has in that case is to let U and S continue their interaction, which lets U compute Fk (·) on U’s input, or interfere with it in a way which prevents everyone from computing Fk (·) on any input in that subsession. In the OPRF scheme DH-OPRF from [31], recalled in Appendix B, which realizes FOPRF under the One More Diffie-Hellman assumption, the role of that prefix is played by the user’s message a. Indeed, it is easy to see that an active attacker cannot 15 succeed in computing Fk (x) on x of its choice if it forwards some honest U’s message a to S instead of replacing it with its own message a0 . This “no successful active attack if transcript prefixes match” property of OPRF is used in the proof of security of the Strong aPAKE protocol OPAQUE, shown in the generic form in Fig. 8 in Section 5. In Section 5.2 we include a more detailed explanation of how the security argument for OPAQUE uses this property, but a general intuition is that this property forces the man-in-the- middle attacker against an OPRF scheme to decide, for every S session, whether to (I) make this session potentially useful for some U session or (II) make this S session useful only to the adversary. (This is the consequence of making S and U transcript prefixes either match or not match.) This can be useful in a higher- level protocol (e.g., OPAQUE) because it allows the simulator of this protocol decide whether a given S session can be “connected” to some honest U (case I) or it is actively attacked (case II). 4 A Compiler from aPAKE to Strong aPAKE via OPRF Under revision. This section is to be revised to adapt the proof to the modified SaPAKE functionality from Fig.2 and OPRF functionality from Fig. 3. We’ll do so shortly. In Fig. 4 we specify a compiler that transforms any OPRF and any aPAKE into a Strong aPAKE protocol. In UC terms the Strong aPAKE protocol is defined in the (FOPRF , FaPAKE )-hybrid world, for FOPRF with the output length parameter ` = 2τ . The compiler is simple. First, the user transforms its password pw into a randomized value rw by interacting with the server in an OPRF protocol where the user inputs pw and the server inputs the OPRF key. Nothing is learned at the server about pw (i.e., rw is indistinguishable from random as long as the input pw is not queried as input to the OPRF). Next, the user sets rw as its password in the given aPAKE protocol. Note that since the password rw is taken from a pseudorandom set, then even if the size of this set is the same as the original dictionary D from which pw was taken, the pseudorandom set is unknown to the attacker (the attacker can only learn this set via OPRF queries which require an online dictionary attack). Thus, any previous ability to run a pre-computation attack against the aPAKE protocol based on dictionary D is now lost. We assume that A always simultaneously sends queries (Compromise, sid ) and (StealPwdFile, sid ) for the same sid , resp. to FOPRF to FaPAKE , because in any instantiation of this scheme the server’s OPRF-related state and aPAKE- related state would be part of the same file[sid]. Consequently, for a single sid , S’s status (compromised or not) in FOPRF and FaPAKE is always the same. 16 Password Registration On input (StorePwdFile, sid , U, pw), S sends (Init, sid , pw) to FOPRF . On FOPRF ’s response (Init, sid , rw), S sends (StorePwdFile, sid , U, rw) to FaPAKE . Server Compromise Message (StealPwdFile, sid ) from A to functionality FsaPAKE is interpreted as if A sent two messages: (1) a (StealPwdFile, sid ) message to functionality FaPAKE and (2) a (Compromise, S) message to functionality FOPRF . Password Authentication and Key Generation 1. On input (UsrSession, sid , ssid , S, pw0 ), U sends (Eval, sid , ssid , S, pw0 ) to FOPRF . On FOPRF ’s response (Eval, sid , ssid , rw0 ), U sends (UsrSession, sid , ssid , S, rw0 ) to FaPAKE . 2. On input (SvrSession, sid , ssid ), S sends (SndrComplete, sid , ssid ) to FOPRF and (SvrSession, sid , ssid ) to FaPAKE . 3. On (sid , ssid , SK) or (abort, sid , ssid ) from FaPAKE , the recipient, either U or S, outputs this message. Fig. 4: Strong aPAKE Protocol in the (FOPRF , FaPAKE )-Hybrid World 4.1 Proof of Security Theorem 1. The protocol in Fig. 4 UC-realizes the FsaPAKE functionality assuming access to the OPRF functionality FOPRF and aPAKE functionality FaPAKE . Proof. For any adversary A, we construct a simulator SIM as in Fig. 5 and Fig. 6. Following [14], without loss of generality, we may assume that A is a “dummy” adversary that merely passes all its messages and computations to the environment Z. We omit all interactions with corrupted U and S where SIM acts as FOPRF and FaPAKE , since the simulation is trivial (SIM gains all information needed and simply follows the code of FOPRF /FaPAKE ). To keep notation brief we denote functionality FsaPAKE as F. We now show that the distinguishing advantage of Z between the real world and the simulated world is negligible. The argument uses a sequence of games, starting from the real world and ending at the simulated world; for any G ,G two adjacent games Gi and Gi+1 , let DistZ i i+1 denote the distinguishing advantage of Z between them, i.e., G ,Gi+1 DistZ i = | Pr[Z outputs 1 in Gi ] − Pr[Z outputs 1 in Gi+1 ]|. G ,Gi+1 (DistZ i is a function of the security parameter τ , but we omit τ below.) 17 ZO o / U/S O ZO o / U/S O  FsaPAKE O   Ao / FOPRF /FaPAKE   Ao / SIM (a) real world (b) simulated world Initialization (SIM assumes knowledge of server identity S and session identifier sid s.t. S initialized the SaPAKE instance via message StorePwdFile to FsaPAKE .) Set tx := 0 and tested := ∅ and send (Init, S, sid ) to A as a message from FOPRF . Stealing Password Data and Offline Queries 1. On (Compromise, sid ) from A aimed at FOPRF and (StealPwdFile, sid ) from A aimed at FaPAKE , send (StealPwdFile, sid ) to F. If F returns “no password file,” pass it to A as a message from FaPAKE . If F returns “password file stolen,” mark S and hfile, U, S, ·i compromised (record hfile, U, S, ⊥i and mark it compromised if there is no such record). Furthermore, if · is a string rw and rw ∈ tested, then send rw to A as a message from FaPAKE ; else send “password file stolen” to A as a message from FaPAKE . 2. On (OfflineEval, sid , S, x) from A aimed at FOPRF , if S is corrupted or marked compromised, send (OfflineEval, sid , FS (x)) to A as a message from FOPRF (pick FS (x) ←R {0, 1}` if it is undefined) and (OfflineTestPwd, sid , x) to F. If F returns “correct guess,” retrieve hfile, U, S, ·i (there must be such record, since if S is marked compromised, A must have sent (Compromise, sid ) aimed at FOPRF and (StealPwdFile, sid ) aimed at FaPAKE previously, and at that time hfile, U, S, ·i was recorded); if the last item is ⊥, replace it with FS (x). 3. On message (OfflineTestPwd, sid , rw∗ ) from A aimed at FaPAKE , add rw∗ to tested. If there is a record hfile, U, S, rwi marked compromised, do: – If rw = rw∗ , send “correct guess” to A as a message from FaPAKE . – Else send “wrong guess” to A as a message from FaPAKE . Fig. 5: The Simulator SIM in the Stealing Password Data Phase 18 Password Authentication 1. On (UsrSession, sid , ssid , U, S) from F, send (Eval, sid , ssid , U, S) to A as a message from FOPRF . Also, if this is the first UsrSession message for ssid , record hssid , U, Si and mark it fresh. 2. On (SvrSession, sid , ssid , U, S) from F, if there is no record hfile, U, S, ·i, record hfile, U, S, ⊥i and mark it uncompromised; regardless, send (SndrComplete, sid , ssid , S) and (SvrSession, sid , ssid , U, S) to A as messages from resp. FOPRF and FaPAKE . Also, if this is the first SvrSession message for ssid , set tx++, record hssid , S, Ui and mark it fresh. 3. On (RcvComplete, sid , ssid , i∗ ) from A aimed at FOPRF , retrieve hssid , U, Si; ignore this message if (i) there is no such record, or (ii) i∗ = S and tx = 0. Else augment the record to hssid , U, S, i∗ i and mark it fresh, send (UsrSession, sid , ssid , U, S) to A as a message from FaPAKE , and if i∗ = S then set tx−−. Active Session Attacks 1. On (TestPwd, sid , ssid , P, rw∗ ) from A aimed at FaPAKE , if there is a record hssid , U, S, i∗ i (if P = U) or hssid , S, Ui (if P = S) marked fresh, mark it stale and check if there is an x such that rw∗ = Fi∗ (x) (if P = U) or rw∗ = FS (x) (if P = S). – If there are more than one such x’s, output collision and abort. – If there is a unique such x, send (TestPwd, sid , ssid , P, x) to F and pass it to A as a message from FaPAKE . Also, if P = S and F returns “correct guess,” retrieve hfile, U, S, ·i, and if the last item is ⊥, replace it with rw∗ . – If there is no such x, send “wrong guess” to A as a message from FaPAKE . 2. On (Impersonate, sid , ssid ) from A aimed at FaPAKE , if there is a record hssid , U, S, i∗ i marked fresh, mark it stale and do: – If i∗ = S, send (Impersonate, sid , ssid ) to F, and pass F’s response (“correct guess” or “wrong guess”) to A as a message from FaPAKE . – Else send “wrong guess” to A as a message from FaPAKE . Key Generation and Authentication 1. On (NewKey, sid , ssid , P, SK ∗ ) or (TestAbort, sid , ssid , P) from A aimed at FaPAKE , if there is a record hssid , U, S, S∗ i (if P = U) or hssid , S, Ui (if P = S) not marked completed, pass the message from A to F. In the case of (TestAbort, sid , ssid , P), also pass F’s response (Succ or Fail) to A as a message from FaPAKE . Finally, mark the record above completed. Fig. 6: The Simulator SIM in the Login Phase 19 Games G0 and G1 : G0 is the real world. In G1 , on (UsrSession, sid , ssid , S, pw0 ) from Z to U and (RcvComplete, sid , ssid , i∗ ) from A to FOPRF , record hssid , U, S, i∗ , rw0 i (instead of hssid , U, S, rw0 i). Obviously, G0 ,G1 DistZ = 0. Game G2 : On (TestPwd, sid , ssid , P, rw∗ ) from A to FaPAKE , if there is a record hssid , U, S, i∗ , rw0 i (if P = U) or hssid , S, U, rwi (if P = S) marked fresh, check if there is an x such that rw∗ = Fi∗ (x) (if P = U) or rw∗ = FS (x) (if P = S). – If there are more than one such x’s, output collision and abort. – If there is a unique such x and x = pw0 (if P = U) or x = pw (if P = S), send “correct guess” to A as a message from FaPAKE . – In all other cases (i.e., x 6= pw0 /pw or there is no such x), send “wrong guess” to A as a message from FaPAKE . First consider event collision. collision occurs if and only if there are more than one x’s such that rw∗ = Fi∗ (x) (if P = U) or rw∗ = FS (x) (if P = S). This means that there are x1 6= x2 such that Fi∗ (x1 ) = Fi∗ (x2 ) or FS (x1 ) = FS (x2 ). Note that FS (·) and Fi∗ (·) are both random functions onto {0, 1}2τ . Assuming that A sends qF Eval and OfflineEval messages aimed at FOPRF in total and there are qU U sub-sessions, there are at most qF + qU + 1 F values defined in total (qF defined by A’s actions, qU defined by U’s input to the protocol, and 1 by S’s input to the protocol), so we have that (qF + qU + 1)2 Pr[collision] ≤ . 22τ +1 Next assume that collision does not occur. Consider the first message of type (TestPwd, sid , ssid , P, rw∗ ) (note that A receives a reply for the first such message only, since hssid , P, P0 , ·i becomes either compromised or interrupted after the first message). In both G1 and G2 , A receives “correct guess” if and only if rw∗ = Fi∗ (pw0 ) (if P = U) or rw∗ = FS (pw) (if P = S), so Z’s views in G1 and G2 in this case are identical. We have that (qF + qU + 1)2 DistG Z 1 ,G2 ≤ Pr[collision] ≤ , 22τ +1 which is a negligible function of τ . Game G3 : On (Impersonate, sid , ssid ) from A to FaPAKE , if there is a record hssid , U, S, i∗ , rw0 i marked fresh, send “correct guess” to A if S is marked compromised, i∗ = S and pw0 = pw; otherwise send “wrong guess.” Similar with above, A receives a reply for the first Impersonate message only, so we only consider the first such message. Note that in G2 A receives “correct guess” if and only if S is compromised and rw0 = rw, where rw0 = Fi∗ (pw0 ) and rw = FS (pw). Z’s views in G2 and G3 are identical unless (i∗ 6= 20 S ∨ pw0 6= pw) ∧ Fi∗ (pw0 ) = FS (pw) (in which case A receives “correct guess” in G2 and “wrong guess” in G3 ). Since i∗ 6= S or pw0 6= pw, Fi∗ (pw0 ) and FS (pw) are two independent random strings is {0, 1}2τ ; therefore, for a single U session, the probability that Fi∗ (pw0 ) = FS (pw) is at most 1/22τ . We have that G2 ,G3 qU DistZ ≤ , 22τ which is a negligible function of τ . Game G4 : After Z sends (StorePwdFile, sid , U, pw) to S, record hfile, U, S, ⊥i (instead of hfile, U, S, rw := FS (pw)i); replace ⊥ with rw := FS (pw) in the following two cases: (i) when A sends (OfflineEval, sid , S, pw) to FOPRF , and S is corrupted or marked compromised; (ii) when A sends (TestPwd, sid , ssid , S, rw∗ ) to FaPAKE , and rw∗ = rw. If neither (i) nor (ii) happens, rw = FS (pw) is a random string in {0, 1}2τ in Z’s view. Therefore, replacing rw with ⊥ in this case creates a 1/22τ distinguishing advantage. We have that G3 ,G4 1 DistZ ≤ , 22τ which is a negligible function of τ . Game G5 : Postpone the recording of hfile, U, S, ·i until (i) A sends (Compromise, sid ) to FOPRF and (StealPwdFile, sid ) to FaPAKE , or (ii) S sends (SvrSession, sid , ssid ) to FaPAKE . Note that if neither (i) nor (ii) happens, G4 does not retrieve hfile, U, S, ·i. Therefore, G4 ,G5 DistZ = 0. Note that in G5 , rw = FS (pw) and rw0 = Fi∗ (pw0 ) are defined no matter A queries them (i.e., A sends (OfflineEval, sid , S, pw) to FOPRF when S is corrupted or marked compromised; or A sends (Eval, sid , ssid , pw0 ) and then (RcvComplete, sid , ssid , A, i∗ ) to FOPRF ) or not. Game G6 : Leave rw (resp. rw0 ) undefined unless and until A queries FS (pw) (resp. Fi∗ (pw0 )). If A does not query FS (pw) (resp. Fi∗ (pw0 )), rw (resp. rw0 ) is a random string in {0, 1}2τ in Z’s view. Since there is 1 rw and qU rw0 ’s, we have that G5 ,G6 qU + 1 DistZ ≤ , 22τ which is a negligible function of τ . Game G7 : G7 is the simulated world. We can see that the change from G6 to G7 is merely conceptual, with the game challenger split into the SaPAKE functionality F and the simulator SIM. We have that G6 ,G7 DistZ = 0. 21 Summing up all results above, we conclude that Z’s distinguishing advantage between the real world and the simulated world is a negligible function of τ . This completes the proof. 5 A Compiler from AKE-KCI to Strong aPAKE via OPRF Our second transformation for building a Strong aPAKE protocol composes an OPRF with an Authenticated Key Exchange (AKE) protocol, “glued” together using authenticated encryption. We require the AKE to be secure in the UC model, namely, to realize the UC KE functionality of [18], and to also be “KCI secure.” The latter notion was defined in [38] under a game-based formulation and formalized in Section 5.1 below in the UC setting. 5.1 UC Definition of AKE-KCI The notion of KCI (for “key-compromise impersonation”) security for KE protocols, concerns an attacker A who learns party P’s long-term keys but otherwise does not actively control P. Resistance to KCI attacks, or “KCI security” for short, postulates that even though A can impersonate P to other parties, sessions which P itself runs with honest peers are unaffected and remain secure. A game-based definition of KCI security appears in [38], and here we formalize it in the UC model through functionality FAKE−KCI , shown in Fig. 7.7 Functionality FAKE−KCI extends the standard KE functionality of [18] with two adversarial actions. The first, Compromise, when issued at a party P models the leakage of P’s secret keys to the attacker. In contrast to the case where P is corrupted, a compromised P is not controlled by the attacker A but A can actively impersonate P in sessions with P0 by virtue of knowing P’s keys. The second action, Impersonate, represents this ability of the attacker: it is directed at sessions hssid , P0 , Pi where P is compromised and the result is that A gets to choose the session key via the NewKey action. Note that if P is compromised but not corrupted then sessions hssid , P, P0 i with an uncompromised and uncorrupted P0 are not considered compromised. This captures the resistance to KCI attacks. All other elements in FAKE−KCI are the same as in the basic UC KE functionality, except of syntactic adjustments to the user-server setting. Instantiations of AKE-KCI secure protocols. For concreteness we consider two examples of key-exchange protocols that can satisfy AKE-KCI security as defined above. While we do not include explicit proofs for these protocols here, we argue their security based on known results. The first 7 The UC KCI-AKE functionality in Fig. 7 revises the UC KCI-AKE functionality which appeared in [33] by handling adaptive compromise of either party, and not only the server. The “two-sided adaptive” security of AKE is indeed needed in SaPAKE protocol shown in Section 5.2, as we explain in Section 5.3. 22 In the description below, we assume P, P0 ∈ {U, S}. – On (UsrSession, sid , ssid , S) from U, send (UsrSession, sid , ssid , U, S) to A∗ . If ssid was not used before by U, record hssid , U, Si and mark it fresh. – On (SvrSession, sid , ssid , U) from S, send (SvrSession, sid , ssid , U, S) to A∗ . If ssid was not used before by S, record hssid , S, Ui and mark it fresh. – On (Compromise, sid , P) from A∗ , mark P compromised. – On (Impersonate, sid , ssid , P) from A∗ , if P is marked compromised and there is a record hssid , P0 , Pi marked fresh, mark this record compromised. – On (NewKey, sid , ssid , P, SK ∗ ) from A∗ where |SK ∗ | = τ , if there is a record hssid , P, P0 i not marked completed, do: • If the record is compromised, or P or P0 is corrupted, set SK := SK ∗ . • If the record is marked fresh, a (sid , ssid , SK 0 ) tuple was sent to P0 , and at that time record hssid , P0 , Pi was marked fresh, set SK := SK 0 . • Else pick SK ←R {0, 1}τ . Finally, mark hssid , P, P0 i completed and send (sid , ssid , SK) to P. Fig. 7: Adaptively Secure Functionality FAKE−KCI protocol we consider is the SIGMA protocol from [37] which is the basis for the key exchange protocols behind TLS 1.3 and IKEv2. SIGMA has been proven secure in [17] in the UC AKE formalization of [18]. While this formalization does not include the KCI property, KCI can easily be argued for SIGMA based on the unforgeability of the signature scheme. A second example is the HMQV protocol from [38] whose KCI property was proved in [38] in the game-based Canetti-Krawczyk model [16] extended to include KCI security. Here we require UC security, namely, a protocol that realizes functionality FAKE−KCI . Fortunately, [18] proves the equivalence of the game-based definition of [16] and their UC AKE formulation (in particular, this equivalence applies to the three-message HMQV with explicit authentication). While this UC formulation does not include KCI security, the equivalence with the game-based definition extends to the KCI case. Indeed, since the original equivalence from [18] holds even in the case of adaptive party corruptions, the Compromise and Impersonate actions introduced here – which constitute a limited form of adaptive corruptions – follow as a special case. Finally, we note that the equivalence between the above models also preserves forward secrecy, so this property (proved in the game-based Canetti-Krawczyk model in [38]) holds in the UC too. The security of HMQV (without including security against the leakage of ephemeral exponents) is based on the CDH assumption in the RO model [38]. 23 Public Components: – KCI-secure AKE protocol Π with private/public keys denoted ps , Ps , pu , Pu ; – Random-key robust and equivocable authenticated encryption scheme AE = (AuthEnc, AuthDec) with (2τ )-bit keys; – Functionality FOPRF with output length parameter ` = 2τ . Password Registration 1. On input (StorePwdFile, sid , U, pw), S generates public key pairs (ps , Ps ) and (pu , Pu ), and sends (Init, sid ) and (OfflineEval, sid , S, pw) to FOPRF . 2. On FOPRF ’s response (OfflineEval, sid , rw), compute c ← AuthEncrw (pu , Pu , Ps ) and record file[sid ] := (ps , Ps , Pu , c). Server Compromise – On (StealPwdFile, sid ) from A, S retrieves file[sid ] and sends it to A. Password Authentication and Key Generation 1. On (UsrSession, sid , ssid , S, pw0 ), U sends (Eval, sid , ssid , S, pw0 ) to FOPRF and records FOPRF ’s response (Prefix, ssid , prfx). 2. On (SvrSession, sid , ssid ), S parses file[sid ] = (ps , Ps , Pu , c), sends (ssid , c) to U and (SndrComplete, sid , ssid ) to FOPRF . On FOPRF ’s response (Prefix, ssid , prfx0 ), S runs Π on (ps , Ps , Pu ) and ssid Π := [ssid ||prfx0 ]. 3. On (Eval, sid , ssid , rw0 ) from FOPRF and c from S, U decrypts m := AuthDecrw0 (c). If m can be parsed as (p0u , Pu0 , Ps0 ) then U retrieves (Prefix, ssid , prfx) and runs Π on input (p0u , Pu0 , Ps0 ) and ssid Π := [ssid ||prfx]; Else U outputs (abort, sid , ssid ) and halts. 4. Given Π’s local output SK, each party, either U or S, outputs (sid , ssid , SK). Fig. 8: OPRF-AKE: Strong aPAKE based on AKE-KCI and OPRF 5.2 Strong aPAKE Construction from OPRF and AKE-KCI In Fig. 8 we present the Strong aPAKE protocol we call OPRF-AKE, because it is based on OPRF and AKE-KCI, “glued” with the equivocable robust symmetric encryption. The protocol uses the same OPRF tool as the Strong aPAKE construction of Section 4, for length parameter ` = 2τ , which defines the “randomized password” value rw = Fk (pw) for user U’s password pw and OPRF key k held by server S. We assume that in the AKE-KCI protocol Π each party holds a (private,public) key pair, and that the each party runs the Login subprotocol using its key pair and the public key of the counterparty as inputs. In Password Registration phase, server S generates the user U’s keys, and S’s password file contains S’s key pair ps , Ps ; U’s public key Pu ; and a ciphertext c of U’s private key pu , and the public keys Pu and Ps created using an Authenticated Encryption scheme using rw = Fk (pw) as the key. After 24 creating the password file, value pu is erased at S. In Login phase, S runs OPRF with U, which lets U compute rw = Fk (pw), it sends c to U, who can decrypt it under rw and retrieves its key-pair pu , Pu together with the server’s key Ps , at which point both parties have appropriate inputs to the AKE-KCI protocol Π to compute the session key. Role of Authenticated Encryption. Protocol OPRF-AKE utilizes an authenticated encryption scheme AE = (AuthEnc, AuthDec) to encrypt and authenticate U’s AKE “credential” m = (pu , Pu , Ps ). We encrypt the whole payload m for simplicity; in fact, unlike U’s private key pu , values Pu , Ps could be public and need to be only authenticated, not encrypted. However, the authentication property of AE must apply to the whole payload. Intuitively, U must authenticate S’s public key Ps , but if U derived even its key pair (pu , Pu ) using just the secrecy of rw = Fk (pw), e.g., using rw as randomness in a key generation, and U then executed AKE on such (pu , Pu ) pair, the resulting protocol would already be insecure. To see an example, if an AKE leaks U’s public key input Pu (note that AKE does not guarantee privacy of the public key) then an adversary A who engages U in a single protocol instance can find U’s password pw via an offline dictionary attack by running the OPRF with U on some key k ∗ , and then given Pu leaked in the subsequent AKE it finds pw such that the key generation outputs Pu as a public key on randomness rw = Fk∗ (pw). Thus the role of the authentication property in authenticated encryption is to commit A to a single guess of rw and consequently, given the OPRF key k ∗ , to a single guess pw. (Note that our UC OPRF notion implies that F is collision-resistant.) To that end we need the authenticated encryption to satisfy the following property which we call random-key robustness:8 For any efficient algorithm A, AdvRBST,AE A = Pr [c ← A(k1 , k2 ) s.t. AuthDeck1 (c) 6=⊥, AuthDeck2 (c) 6=⊥] k1 ,k2 ←R {0,1}τ is a negligible function of τ . In other words, it must be infeasible to create an authenticated ciphertext that successfully decrypts under two different randomly generated keys. This property can be achieved in the standard model using e.g., encrypt-then-MAC with a MAC that is collision resistant with respect to the message and key, a property enjoyed by HMAC with full hash output. In the RO model used by our aPAKE application one can also enforce it for any authenticated encryption scheme by attaching to its ciphertext c a hash H(k, c) for a RO hash H with 2τ -bit outputs. Encryption Equivocability. In the security argument we also need the authenticated encryption scheme to be equivocable in the following sense: In the scenario where the adversary gets a ciphertext followed by the key, there is 8 This notion is a weakening of full robustness (FROB) from [23] where the attacker is allowed to choose k1 , k2 (in our case these keys are random). An even weaker notion, Semi-FROB, is defined in [23] where k1 , k2 are random but only k1 is provided to A. 25 a simulator who first creates the ciphertext with no information about the plaintext, and then creates the key given the plaintext. Formally, an authenticated encryption scheme AE is equivocable if for any efficient algorithm A, there is an efficient stateful simulator SIMEQV such that the distinguishing advantage of A’s views in the following two games, AdvEQV,AE R , is a negligible function of τ : – The real game: A sends out message m, and computes its final output given (c, k) produced as k ←R {0, 1}τ and c ← AuthEnck (m). – The ideal game: A sends out message m, and computes its final output given (c, k) produced as c ← SIMEQV (|m|) and k ← SIMEQV (m). (Since SIMEQV is stateful, we assume that when computing c ← SIMEQV (|m|) it creates an internal state which it then utilizes to compute k ← SIMEQV (m).) Common encryption modes are equivocable under some idealized assumption. For example, an encryption scheme that xor’s the message m with a pad generated by a pseudorandom generator G is equivocable if we model G as a random oracle. In this case, in response to an AuthEnc query, SIMEQV will choose the ciphertext c at random; then, to respond to a corresponding Reveal query with message m, SIMEQV programs G’s output to c ⊕ m. If the PRG is implemented in counter mode using a block cipher a similar strategy works but in the ideal cipher model. The above can be extended to Authenticated Encryption modes. For example, if a MAC is computed on the ciphertext, SIMEQV responds to an AuthEnc query by choosing a random MAC key k and outputting (c, mack (c)) where c is chosen as above. To output the MAC key upon a Reveal query with message m, SIMEQV outputs k (which is independent of the message hence it works with any message). Note on not utilizing FAKE−KCI . In Fig. 8 we abstract the OPRF protocol as functionality FOPRF , but we use the real-world AKE-KCI protocol Π, rather than functionality FAKE−KCI . The reason for this presentation is that in the KE functionality of [18], of which FAKE−KCI is an extension, it is not clear how to support a usage of the KE protocol on keys which are computed via some other mechanism than the intended KE key generation. The KE functionality of [18] assumes that each entity keeps its private key as a permanent state, authenticates to a counterparty given its identity, and a KE party cannot specify any bitstring as one’s own private key and a counterparty’s public key. This is not how we use AKE in protocol OPRF-AKE in Fig. 8 precisely because U does not keep state and has to reconstruct its keys from a password (via OPRF). However, we can still use the real-world protocol Π, which UC-realizes FAKE−KCI , giving it the OPRF-computed information as input. In the proof of security we utilize the simulator SIMAKE , which shows that Π UC-realizes FAKE−KCI , in our simulator construction, but we rely on its correctness only if U runs Π on the correctly reconstructed (pu , Ps , Ps ), and if the adversary causes U to reconstruct a different string we interpret this as a successful attack on U’s login session. Role of OPRF Transcript Prefixes. As discussed in Section 3, the UC OPRF functionality FOPRF used here extends the corresponding functionality of the 26 proceedings version of this paper [33] by having OPRF parties U and S output an OPRF transcript prefix, whose implications are that for every S session the adversary must follow one of the following paths: (I) If the transcript prefix output by some S session matches the transcript output by some U session, then this S session can only be used to let U compute Fk (pw0 ) on U’s input pw0 ; or (II) If the transcript prefix output by S does not match any prefix output by some U, then this S session can only be used by the adversary, and in particular it cannot be “connected” to some honest U session. This property helps in the security proof of the SaPAKE protocol of Fig. 8. The difference between that protocol and its variant shown in the proceedings version of this paper [33], is that when U and S complete their OPRF interaction, each party outputs an OPRF transcript prefix prfx, and when each party starts its end of the AKE (sub)protocol, it runs the AKE with (sub)session ID’s computed as ssid Π := [ssid ||prfx], where ssid is the identifier of a particular instance of the SaPAKE login protocol. The security property of AKE, see Fig. 7, implies that U and S cannot establish a shared session key unless they run on the same (sub-)session ID’s. With sub-session ID’s defined as above, this implies that they cannot establish a shared key unless their OPRF transcript prefixes match. This can be useful in a security proof because for each server S session the adversary has to decide if this S session can be “connected” to some honest U, because its OPRF transcript prefixes match (case I), or this S session is actively attacked, and cannot be matched with any U session, because its OPRF transcript prefix does not match that of any U session (case II). This can help in the security argument, because it lets the simulator decide which S session is (potentially) actively attacked (case II), and which one is not (case I). Here is why the simulator needs to be able to make this decision, and why without this “no active attack if transcript prefixes match” property the security argument does not quite work. (Indeed, this problem appears in the variant of this protocol which was in the proceedings version of this paper [33], and this is why we modify this protocol as described above.) Consider an adversary A playing a man-in-the-middle attack between n simultaneous U sessions and n S sessions. Consider an OPRF protocol which, like the DH-OPRF scheme of Fig. 13 in Appendix B, is malleable, i.e. A can blind U’s messages to S and de- blind S’s responses, and even though U-A and A-S communications cannot be externally linked, A actually lets U compute the correct outputs in the OPRF protocol with S. In the case of protocol DH-OPRF, the adversary A can do this by replacing U’s message a with message a0 = (a)t to S, for t ←R Zq , and then replacing S’s reply b0 = (a0 )k with reply b = (b0 )1/t to U. Note that b = ak so U’s interaction with A is just as good as an interaction with S, but the U-A interaction cannot be linked, by an external observer, to any unique S session. This creates a problem in the simulation because if the simulator observes that A locally computes Fk (pw∗ ) for some pw∗ , e.g. by observing A’s hash function queries, the simulator needs to make (TestPwd, ..., S, pw∗ ) query against some session of S, to program the Fk (pw∗ ) output depending on 27 whether pw∗ = pw or not. However, once the simulator sends TestPwd to S session, that session becomes interrupted (assume pw∗ 6= pw), and cannot be then connected, i.e. output the same session key, with any U session. If the A-S and U-A interactions are unlinkable then the simulator can only choose S session for TestPwd query at random. If A makes half of S session connect to half of U sessions, while using the other half for local computation of Fk (·) on n/2 password tests, the simulator needs to make too many guesses to be successful with non-negligible probability. Fortunately, using (the hash of) the OPRF transcript prefix (defined as the value a in this case) in the AKE protocol solves this problem because now to make any U and S session connect in the AKE instance the adversary has to make a0 = a, which the simulator can detect. Since the modified OPRF security property says that such prefix-matched OPRF sessions cannot be used by A to compute Fk (·), if the simulator detects such computation by A, it can be tested against any S session whose prefix does not match message a sent by any user U. Role of Strong aPAKE Model Relaxations. As discussed in Section 2, our notion of Strong aPAKE functionality FsaPAKE , Fig. 2, is a relaxation of a variant of this notion presented in the proceedings version of this paper [33], which we denote here as FsaPAKE+ , Fig. 1. We introduced these relaxations because they model the adversarial behavior that is possible in protocol OPRF-AKE, and although in Section 2 we explain that these relaxations are mild and should be insignificant in practice, they are nevertheless necessary, and in particular, protocol OPRF-AKE cannot be shown to realize the stronger functionality FsaPAKE+ . Recall from Section 2 that there are two relaxations which are introduced in functionality FsaPAKE : (1) An attacker who tests the correct password guess on some server S session can compromise all other open, i.e. not completed, S sessions, even if the adversary already effectively tested other passwords on these sessions; (2) An attacker must commit to a tested password for each S session, but that tested password might be efficiently extractable, by the simulator, only after the attacked session completes (although the attacker in that case learns only if its password guess was correct or not, and learns no information about the session key SK which that S session outputted, even if its password guess was correct). The need for both relaxations with regards to protocol OPRF-AKE stem from two features of our notion of UC OPRF. The first one is a lack of a binding between the server-side and client-side view of OPRF instances. Note that instances of server S OPRF evaluation are indexed by (sub)session ID’s ssid s , and each of them allows the FOPRF -hybrid world adversary A to compute function Fk on one argument x, as modeled by a sequence of queries (Eval, sid , ssid ∗ , S, x) and (RcvComplete, sid , ssid ∗ , A, S) from A. However, the OPRF functionality does not enforce any correspondence between the server-side (sub)session ID ssid s of S and the client-side “session” pointer ssid ∗ used by A. Therefore if the environment opens n S sessions simultaneously (note that all of them operate on the same password pw), then when A 28 evaluates Fk (pw∗ ) through Eval+RcvComplete commands above this constitutes a password test (note that if pw∗ = pw then rw = Fk (pw∗ ) can decrypt user-credentials (pu , Pu , Ps ) from the authenticated ciphertext c), but each such test is in effect a test against all open S sessions, because if n OPRF-AKE S instances trigger n OPRF S instances then A can test n passwords pw∗1 , . . . , pw∗n , and if any of these tests is correct then A can use the decrypted credentials (pu , Pu , Ps ) as inputs to AKE subprotocols of all of these n S sessions, and learn/determine the sessions keys SK1∗ , ..., SK ∗ n on all of them. Note that this behavior is not allowed in the standard notion of UC (Sa)PAKE: There each of the n simultaneously open S sessions are treated separately, and the adversary can determine/learn only SKi∗ for pw∗i = pw, whereas the other secret keys, on sessions corresponding to the n − 1 incorrect password guessess pw∗j 6= pw, would remain secure. As described in Section 2, the relaxed functionality FsaPAKE allows A to compromise all incomplete sessions if one of them is compromised, and the OPRF-AKE simulator SIM utilizes this relaxation by sending (TestPwd, sid , ssid 0 , S, x) to FsaPAKE , given A’s queries (Eval, ..., x) and RcvComplete to FOPRF , for ssid 0 corresponding to any S session which is not completed. (See step (4c) in SIM in Fig. 10.) Note also that the above disconnection between the server-side and the client-side OPRF evaluation is not only a feature of our abstract OPRF functionality FOPRF , but also of protocol DH-OPRF, see Appendix B, which implements this OPRF notion. (Indeed, this is implied by the fact that protocol DH-OPRF realizes FOPRF .) Server-side instances of DH-OPRF are formed by an in-coming message a, a random group element, and the server response b = ak , while the client evaluation of Fk (x) is constitued by a hash function query H2 (x, v) for v = (H1 (x))k , thus there is no link between the client-side evaluation of (H1 (x))k for any particular argument x, with any particular instance of server S OPRF evaluation. The second relaxation reflects the fact that A can evaluate Fk on a password guess pw∗ after it interacts with S in the AKE’s sub-protocol, at which point S session completes. If none of the on-line password tests succeeded so far this AKE instance is secure, and hence is this overall OPRF-AKE instance, because A doesn’t know the user’s authentication keys (pu , Pu , Ps ). However, if A evaluates Fk (pw∗ ) after this session ends it still learns if pw∗ = pw, because A can test if Fk (pw∗ ) correctly decrypts ciphertext c. In the standard UC (Sa)PAKE functionality this would not be simulatable, and thus we relax FsaPAKE so it allows such “delayed” password test. The OPRF-AKE simulator SIM relies on this relaxation by sending Interrupt on all actively attacked S sessions in step (2a) in Fig. 10, and by reacting to (Eval, . . . , x) message in step (4c) by doing (TestPwd, . . . , x) on any non-completed S session if any exist, and if they are not then doing a “postponed” password test on some completed (and previously untested) S session in step (4(c)i). Role of Adaptive Security of AKE and of Equivocable Encryption. Let us also explain why we need the AKE protocol Π to allow adaptive compromise 29 for both parties, and why we need the authenticated encryption scheme AE to be equivocable. Assume for now that the two parties’ passwords match, i.e., pw0 = pw. Recall that an adversary A may compromise S at any time and perform an offline dictionary attack, and that A can perform an on-line dictionary attacks by running the authentication protocol protocol with different password guesses. Observe the following facts: – If A does not compromise S, then Z learns no information about ps . On the other hand, if A compromises S, then Z learns ps as part of file[sid ]. This is equivalent to S being compromised in Π. – If A does not run an either online or offline dictionary attack then rw = FS (pw) is a random string in Z’s view, so by security of AE, Z learns no information about pu . On the other hand, if A succeeds in either online or offline dictionary attack and learns rw = FS (pw), then Z learns pu by parsing m = AuthDecrw (c). This is equivalent to U being compromised in Π. Note that A may compromise S and/or compute FS (pw) at any time. Hence, we need the AKE protocol Π to tolerate adaptive corruptions of either party. This also shows why we need AE to be equivocable. Consider an adversary A who first sees c (in a message from S to U) and then learns rw (via an offline attack). The security of Π relies on the condition that pu is kept random in A’s view until U is compromised (which, as argued above, is equivalent to A learning rw). However, A first sees the ciphertext c, and then learns the decryption key rw, so two things must hold about encryption AE: (1) Ciphertext c = AE(rw, m) must hide all information about the encrypted plaintext m until rw is revealed; (2) The adversary might at some point learn the decryption key rw s.t. c = AE(rw, m). Standard encryption security does not guarantee the security in this adaptive setting, and in particular the standard simulation strategy of replacing c with an encryption of an unrelated value m0 would fail requirement (2) above. The equivocability of AE is exactly what is needed here, because it implies an efficient simulator which can create ciphertext c given only the length of the encrypted plaintext (hence until rw is revealed c leaks nothing about pu ), and then given any plaintext m, e.g. m = (pu , Pu , Ps ), it can produce rw s.t. c = AuthEnc(rw, m). 5.3 Proof of Security We state and prove the Strong aPAKE security of the generic composition protocol OPRF-AKE from Fig. 8 in the following theorem: Theorem 2. If protocol Π UC-realizes functionality FAKE−KCI , and AE is a random-key robust and equivocable authenticated encryption scheme, then the protocol in Fig. 8 UC-realizes functionality FsaPAKE in the FOPRF -hybrid model. Overview of Simulation Strategy. We start with a high-level description of the simulation strategy, whereas the detailed simulation algorithm SIM is contained in Fig. 9, 10, and 11, each figure dealing with the different aspect of 30 protocol OPRF-AKE and hence the simulation as well. Recall that the security proof must construct a simulator SIM which interacts as an ideal-world adversary with functionality FsaPAKE , and creates an indistinguishable view to the environment as that of a real-world adversary A interacting with the honest parties in protocol OPRF-AKE. Consider first the simulation of the password file storage, i.e., the offline security of OPRF-AKE. The actions of simulator SIM regarding this phase are described in Fig. 9, but the idea is that SIM generates a virtual FOPRF instance FS (instead of computing rw := FS (pw)) and generates c using simulator SIMEQV assumed by the equivocability of encryption AE. If server S later becomes compromised, A learns file[sid ] (which SIM obtains by sending (Compromise, sid , S) to SIMAKE ) and also gains offline evaluation access to the FOPRF instance simulated by SIM. At this point A can stage an offline dictionary attack on the password file, by sending (OfflineEval, sid , S, x) queries aimed at FOPRF . The simulator SIM services each such query by sending (OfflineTestPwd, sid , x) to FsaPAKE . If FsaPAKE replies “wrong guess” then SIM replies to A with FS (x) ←R {0, 1}` . If, however, FsaPAKE replies “correct guess” then SIM learns that A’s password guess x is equal to the password pw for which this FsaPAKE instance was initialized, and in this case SIM lets SIMEQV reveal rw s.t. c = AuthEnc(rw, (pu , Pu , Ps )) and “programs” value FS (x) on x = pw to rw.9 By the “random even to the key holder” property of UC OPRF, the adversary’s view of FS outputs set as above is identical to those in the real protocol. Regarding the login phase, i.e., the online security of OPRF-AKE, let us first fix some notation. We use i∗ to denote the function pointer used by A in (RcvComplete, sid , ssid , U, i∗ ) for U’s OPRF session, and c∗ to denote the ciphertext in the message which A passes to U after OPRF evaluation. As in functionality FsaPAKE , Fig. 2, we use pw to denote S’s password, and pw0 to denote U’s password. The details of the simulation procedure regarding the online phase are divided between Fig. 10, where we show how SIM reacts to A’s messages to the OPRF functionality FOPRF (recall that A is an adversary in the FOPRF -hybrid world), and Fig. 11, where we show how SIM reacts to A’s messages related to AKE protocol Π. However, the main ideas of the simulation can be explained by considering the two cases of the man-in-the-middle adversary, who can (1) emulate the server to honest user U instances and (2) emulate the user to honest server S instances, and below we overview how SIM handles each case. When A plays as the server to some user instance (sid , ssid , U), the key observation is that U outputs (abort, sid , ssid ) with overwhelming probability, except for either of the following two cases: – Case (∗) corresponds to line (1a) in AKE Simulation part of Simulator SIM in Fig. 11, and it involves the following two conditions: (i) A is passive until the execution of Π begins, i.e., i∗ = S and c∗ = c, and (ii) the two parties’ passwords match, i.e., pw0 = pw. (Note that SIM can test if pw0 = pw via a 9 Note that after S compromise, A could also guess the correct password pw in an online attack, but as we argue below SIM we can handle that in a similar way. 31 TestAbort message.) In this case U’s input is the “correct” (pu , Pu , Ps ), so SIM can outsource the simulation of AKE protocol Π on behalf of this U’s instance, denoted Πu , to SIMAKE . – Case (∗∗) corresponds to line (1(b)iii) in AKE Simulation part of Simulator SIM in Fig. 11, involves the condition that A computes rw0 = Fi∗ (pw0 ), which can happen (i) in an online OPRF instance between A and S if i∗ = S, or (ii) via offline computation (OfflineEval, sid , i∗ , pw0 ) if i∗ = S and S is compromised or corrupted, or (iii) via offline computation (OfflineEval, sid , i∗ , pw0 ) if i∗ 6= S. In all of these cases Z can choose (p∗u , Pu∗ , Ps∗ ) and set the ciphertext c∗ as c∗ ← AuthEncrw0 (p∗u , Pu∗ , Ps∗ ). However, SIM, who sees A’s OPRF queries, will learn the same information as well, so it can simulate U’s behavior by running Πu on the same inputs as the real-world U would use. Here the random-key robustness of AE is needed, because it guarantees that c∗ decrypts to m 6= ⊥ for at most one key (with overwhelming probability), which allows SIM to determine (i∗ , pw0 ) such that c∗ is a valid encryption under key rw0 = Fi∗ (pw0 ). When A plays as the user to some server instance (sid , ssid , S), the input of S is always the “correct” (pu , Pu , Ps ). Therefore, SIM can outsource the simulation of AKE protocol Π on behalf of this S’s instance, denoted Πs , to SIMAKE . Note that user’s AKE authentication token pu is hidden to A unless and until A computes rw = FS (pw), via either guessing pw and then evaluating OPRF on pw, or by compromising S and then running an offline dictionary attack. In both cases SIM can extract pw by observing A’s interaction with FOPRF , send (TestPwd, sid , ssid , S, pw) to FsaPAKE and learn that pw is S’s password, and then send (Compromise, sid , ssid , U) to SIMAKE . Simulation Components. Since protocol OPRF-AKE relies on the UC security of two components, OPRF and AKE, we briefly describe how the real world and the ideal world interactions involve the protocols, functionalities, or simulators of these components. We describe these two scenarios schematically in a diagram below, where for simplicity we call the FOPRF -hybrid world execution the “real world”. However, since it is an execution in the FOPRF -hybrid world, it involves an interaction between A and U/S via functionality FOPRF , while the direct interaction between A and U/S pertains to the other two components of OPRF-AKE, namely the AE ciphertext c, and the AKE protocol Π. Regarding the ideal world execution, recall that protocol Π realizes functionality FAKE−KCI , hence there is a simulator SIMAKE which can be used to simulate Π’s execution. Indeed, our simulator SIM uses simulator SIMAKE as a sub-routine, essentially “outsourcing” to SIMAKE the interactions of A with (1) all Π executions run by S instances, and (2) those Π executions run by U instances which fall into case (∗) above, i.e. where U runs on its intended inputs (pu , Pu , Ps ). On these Π instances SIM passes all the messages between A and SIMAKE , which we denote in part (b) of the diagram below as the direct link between A and SIMAKE . All the other aspects of the simulation, namely (1) the AuthEnc ciphertext c, (2) the emulation of the OPRF functionality FOPRF , and (3) Π executions of U instances which fall into case 32 (∗∗), will be handled directly by SIM. Finally, note that simulator SIMAKE , while interacting with A, expects to communicate with the ideal AKE functionality FAKE−KCI . Simulator SIM will therefore internally emulate two functionalities: The OPRF functionality FOPRF , used for interactions with A, and the AKE functionality FAKE−KCI , used for interactions with SIMAKE . ZO o / U/S = O ZO o / Ū/S̄ O c+Π  FsaPAKE O  }  Ao / FOPRF   A og / SIM c+FOPRF +Π(∗∗) O Π FAKE−KCI '  (a) real world SIMAKE (b) simulated world Proof by Game Changes. As is standard in UC security proofs, we assume w.l.o.g. that A is a “dummy” adversary who merely passes through all its messages to and from Z. To keep notation brief we denote functionality FsaPAKE as F, and we use Πu and Πs for, respectively, U’s and S’s algorithm in the execution of protocol Π. We use S and sid to denote the unique server entity and session identifier such that S initialized this SaPAKE instance via command (StorePwdFile, sid , ...) to FsaPAKE , and we assume that SIM knows both identifiers. Let AdvEQV,AE R , AdvAUTH,AE R , and AdvRBST,AE R denote an advantage of algorithm R in, respectively, the equivocability game, authenticity game, and random-key robustness game of authenticated encryption scheme AE. Fix an efficient environment Z, and a dummy adversary A, and let qF be the number of Eval and OfflineEval messages sent by A or Ū given this Z, and let qU be the number of U SaPAKE instances, i.e. the number of UsrSession queries sent to all user U entities by Z. We will show that the advantage of Z in distinguishing between the real world and the simulated world executions is negligible, and we do so using a sequence of games, starting from the real world and ending at the simulated world. For any two adjacent games we use G ,G DistZ i i+1 to denote the advantage of Z in distinguishing Gi and Gi+1 . Game G0 : G0 is the real world. Note that in G0 , an instance of U and sub-session ID ssid results in U outputting (abort, sid , ssid ) if and only if it receives (ssid , c∗ ), A specifies 33 Initialization Set tx := 0. Initialize SIMAKE and function family {Fi } s.t. for all (i, x), including i = S, Fi (x) is undefined. Whenever SIM references undefined value Fi (x) below, set Fi (x) ←R {0, 1}` . Set c ← SIMEQV (kpr + 2kpb ), where kpr , kpb are the lengths of private/public AKE keys, and record file[sid ] := (⊥, ⊥, ⊥, c). Stealing Password Data and Offline Queries 1. On (Compromise, sid ) aimed at FOPRF and (StealPwdFile, sid ) aimed at S from A (we assume A sends these commands together), send (StealPwdFile, sid ) to F. If F returns “no password file,” pass this message to A on behalf of S. Otherwise declare S compromised, send (Compromise, sid , S) to SIMAKE , and on response (ps , Ps , Pu ) reset file[sid ] := (ps , Ps , Pu , c) and send it to A on behalf of S. 2. On (OfflineEval, sid , i∗ , x) from A aimed at FOPRF , do the following: – If i∗ = S and S is not compromised, ignore this message. – If i∗ = S and S is compromised, send (OfflineTestPwd, sid , x) to F. If F returns “correct guess,” send (Compromise, sid , U) to SIMAKE , and on response (pu , Pu , Ps ) set rw ← SIMEQV (pu , Pu , Ps ) and FS (x) := rw, record hfile, U, S, xi, and declare U compromised. Finally, send (OfflineEval, sid , Fi∗ (x)) to A on behalf of FOPRF . Fig. 9: Simulator SIM for SaPAKE of Fig. 8: Initialization and Offline Attack index i∗ in the (RcvComplete, sid , ssid , U, i∗ ) message aimed at FOPRF , and AuthDecrw0 (c∗ ) = ⊥ (where rw0 = Fi∗ (pw0 )). In the following five games (G1 – G5 ) we gradually change this condition. Game G1 (user aborts if A is passive before Π starts but passwords do not match): In the case that (c∗ = c ∧ i∗ = S) and pw0 6= pw, U outputs (abort, sid , ssid ). Z’s views in G0 and G1 are identical unless on some U sub-session event c∗ = c ∧ i∗ = S ∧ pw0 6= pw occurs but AuthDecrw0 (c) 6= ⊥: In this case U outputs (sid , ssid , SK) in G0 and (abort, sid , ssid ) in G1 ). Since c ← AuthEncrw (pu , Pu , Ps ), we have that AuthDecrw (c) 6= ⊥. But rw0 and rw are independent random strings in {0, 1}2τ ; therefore, we can construct a reduction RRBST1 to the random-key robustness of AE where rw0 and rw are the challenge AE keys: RRBST1 runs the code of G0 except that it uses its input as rw0 and rw. In every sub-session RRBST1 checks if AuthDecrw (c) 6= ⊥ and AuthDecrw0 (c) 6= ⊥, and if so, it outputs c (and breaks the game). We have that G0 ,G1 DistZ ≤ AdvRBST,AE RRBST1 , which is a negligible function of τ . Game G2 (abort the entire game if c∗ is valid under two different keys): In the case that ¬(c∗ = c ∧ i∗ = S), the game outputs halt and aborts 34 OPRF Evaluation 1. On (UsrSession, sid , ssid , U, S) from F, send (Eval, sid , ssid , U, S) to A on behalf of FOPRF . On prfx from A, record hssid , U, prfxi if prfx is new, else reject. 2. On (SvrSession, sid , ssid 0 , U, S) from F, retrieve file[sid ] = (·, ·, ·, c), send (SndrComplete, sid , ssid 0 , S) and c to A on behalf of, respectively FOPRF and S, and given A’s response prfx0 do the following in order: (a) If there is record hssid , U, prfx0 i then replace it with hssid , U, OKi; Else record hssid 0 , acti, set tx++, send (Interrupt, sid , ssid 0 , S) to F. (b) Record hssid Π , ssid 0 , S, Ui and mark it fresh for ssid Π := ssid 0 |prfx0 , and send (SvrSession, sid , ssid Π , U, S) to SIMAKE . 3. On (RcvComplete, sid , ssid , U, i∗ ) from A aimed at FOPRF , retrieve hssid , U, prfxi (ignore the message if such record not found) and do in order: (a) If i∗ = S, S is not compromised, and there is no record hssid , U, OKi, then do: Ignore this message if tx = 0, else set tx−−. (b) Augment record hssid , U, prfxi to hssid , U, prfx, i∗ i. 4. On (Eval, sid , ssid , S, x) followed by (RcvComplete, sid , ssid , A, i∗ ) from A to FOPRF (string prfx chosen by A for this Eval can be ignored), send (Eval, sid , ssid , A, S) to A on behalf of FOPRF and do in order: (a) If i∗ 6= S then send (Eval, sid , ssid , Fi∗ (x)) to A. (b) If i∗ = S and tx > 0, but there is no record hssid 0 , acti then output halt. (c) If i∗ = S and there are some records hssid 0 , acti then do in order: i. If there is record hssid 0 , acti which is not marked completed then choose ssid 0 of any such record, but if all records hssid 0 , acti are marked completed then choose ssid 0 of any of those. ii. Ignore this message if tx = 0, else set tx−− and send (TestPwd, sid , ssid 0 , S, x) to F. iii. If F returns “correct guess,” send (Compromise, sid , U) to SIMAKE , and on response (pu , Pu , Ps ) set rw ← SIMEQV ((pu , Pu , Ps )) and FS (x) := rw, record hfile, U, S, xi, and declare U compromised. iv. Send (Eval, sid , ssid , FS (x)) to A on behalf of FOPRF , and modify the chosen record hssid 0 , acti into hssid 0 , usedi. Fig. 10: Simulator SIM for SaPAKE of Fig. 8: OPRF Evaluation 35 AKE Simulation 1. For any ssid , as soon as hssid , U, prfxi is augmented to hssid , U, prfx, i∗ i and A sends (ssid , c∗ ) to U, retrieve file[sid ] = (·, ·, ·, c) and do one of the following: (a) If (c∗ , i∗ ) =(c, S), then send (TestAbort, sid , ssid , U) to F. If F replies Succ, record hssid Π , ssid , U, Si marked fresh, and send (UsrSession, sid , ssid Π , U, S) to SIMAKE for ssid Π := [ssid ||prfx]. (b) Otherwise for every x s.t. y = Fi∗ (x) is defined, check if AuthDecy (c∗ ) output parses as (p∗u , Pu∗ , Ps∗ ), and do one of the following: i. If there is no such x, send (TestPwd, sid , ssid , U, ⊥) followed by (TestAbort, sid , ssid , U) to F. ii. If there are more than one such x’s, output halt and abort. iii. If there is a unique such x, send (TestPwd, sid , ssid , U, x) to F. - If F replies “wrong guess,” send (TestAbort, sid , ssid , U) to F. - If F replies “correct guess,” do: (1) set (p∗u , Pu∗ , Ps∗ ) := AuthDecy (c∗ ); (2) run Πu on (p∗u , Pu∗ , Ps∗ ) and ssid Π = [ssid ||prfx]; (3) when Πu outputs SK ∗ , send (NewKey, sid , ssid , U, SK ∗ ) to F. 2. On all AKE-related interactions of A with all AKE sessions started by SIM’s SvrSession and UsrSession queries to SIMAKE above, pass all messages between A and SIMAKE , and react to messages sent by SIMAKE ’s interface with FAKE−KCI as follows: – On (Impersonate, sid , ssid Π , S), if S is declared compromised and there is record hssid Π , ssid , U, Si marked fresh, then mark it compromised and send (Impersonate, sid , ssid ) to F. – On (Impersonate, sid , ssid Π , U), if U is declared compromised and there is record hssid Π , ssid , S, Ui marked fresh, then mark it compromised, retrieve hfile, U, S, pwi and send (TestPwd, sid , ssid , S, pw) to F. – On (NewKey, sid , ssid Π , P, SK ∗ ), if there is a record hssid Π , ssid , P, P0 i not marked completed, do: • If the record is compromised, or P or P0 is corrupted, set SK := SK ∗ . • If the record is fresh, and SIM sent (NewKey, sid , ssid , P0 , SK 0 ) to F while record hssid Π , ssid , P0 , Pi was marked fresh, set SK := SK 0 . • Otherwise pick SK ←R {0, 1}τ . Finally, mark hssid Π , ssid , P, P0 i completed and send (NewKey, sid , ssid , P, SK) to F. Fig. 11: Simulator SIM for SaPAKE of Fig. 8: AKE Simulation 36 if there are x1 6= x2 such that A queries both y1 = Fi∗ (x1 ) and y2 = Fi∗ (x2 ), and AuthDecy1 (c∗ ) 6= ⊥ and AuthDecy2 (c∗ ) 6= ⊥. Here are throughout the proof below we say that “A queries Fi∗ (x),” where index i∗ may or may not be S, if (i) A sends (Eval, sid , ssid , x) and then (RcvComplete, sid , ssid , A, i∗ ) to FOPRF and if FOPRF replies to this query with Fi∗ (x) (note that if i∗ = S then FOPRF replies with FS (x) if and only if tx > 0, because FOPRF ’s record hssid , A, x, prfxi corresponding to A’s evaluation query can never satisfy prfx = OK), (ii) if A sends (OfflineEval, sid , i∗ , x) to FOPRF and if FOPRF replies to this query with Fi∗ (x) (note that if i∗ = S then FOPRF replies with FS (x) if and only if S is corrupted or compromised). This terminology is reused in subsequent games. Moreover, we refer to case (i) as “A queries Fi∗ (x) online,” and to case (ii) as “A queries Fi∗ (x) offline.” Note that y1 and y2 are independent random strings in {0, 1}2τ . Therefore, we can construct a reduction RRBST2 to the random-key robustness of AE where y1 and y2 are the challenge AE keys: RRBST2 picks a random pair (j1 , j2 ) where j1 , j2 ∈ {1, . . . , qF } and j1 < j2 10 (a guess that y1 and y2 are the results of A’s j1 -th and j2 -th queries), and runs the code of G1 except that it uses its input as the results of A’s j1 -th and j2 -th queries. In every sub-session RRBST1 checks if AuthDecrw (c∗ ) 6= ⊥ and AuthDecrw0 (c∗ ) 6= ⊥, and if so, it outputs c∗ (and breaks the game). We have that   G1 ,G2 qF DistZ ≤ Pr[halt] ≤ · AdvRBST,AE RRBST2 , 2 which is a negligible function of τ . Game G3 (user aborts if A does not compute rw0 , password match, and A does not change the OPRF index but changes c): In the case that (c∗ 6= c ∧ i∗ = S) and pw0 = pw, U outputs (abort, sid , ssid ) if A does not query Fi∗ (pw). Z’s views in G2 and G3 are identical unless A does not query rw = FS (pw) but on some U sub-session we have c∗ 6= c ∧ i∗ = S ∧ pw0 = pw and AuthDecrw (c∗ ) 6= ⊥: In this case U outputs (sid , ssid , SK) in G2 and (abort, sid , ssid ) in G3 . Since A does not query FS (pw), rw is a random string in {0, 1}2τ in Z’s view. Z additionally learns c ← AuthEncrw (pu , Pu , Ps ), but A’s message is restricted to c∗ 6= c. Therefore, we can construct a reduction RAUTH1 to the authenticity of AE where rw is the challenge AE key: RAUTH1 runs the code of G2 , except that it uses its encryption oracle to compute c ← AuthEncrw (pu , Pu , Ps ), and its decryption oracle to compute AuthDecrw (c∗ ) in every U sub-session (1) which runs on input pw0 = pw and (2) where the OPRF function index is i∗ = S. In each such sub-session RAUTH1 checks if c∗ 6= c and AuthDecrw0 (c∗ ) 6= ⊥, and if so, it outputs c∗ (and breaks the game). We have that DistG Z 2 ,G3 ≤ AdvAUTH,AE RAUTH1 , 10 To be precise, RRBST2 picks j10 ←R {1, . . . , qF }, j20 ←R {1, . . . , qF } \ {j10 }, and sets j1 := min(j10 , j20 ) and j2 := max(j10 , j20 ). 37 which is a negligible function of τ . Game G4 (user aborts if A does not compute rw0 , and either passwords do not match or A changes the OPRF index): In the case that (c∗ 6= c ∧ pw0 6= pw) ∨ i∗ 6= S, U outputs (abort, sid , ssid ) if A does not query Fi∗ (pw0 ). Z’s views in G3 and G4 are identical unless on some U sub-session (c∗ 6= c ∧ pw 6= pw) ∨ i∗ 6= S, and A does not query rw0 = Fi∗ (pw0 ) but AuthDecrw0 (c∗ ) 6= 0 ⊥: In this case U outputs (sid , ssid , SK) in G3 and (abort, sid , ssid ) in G4 on that sub-session. Call the event that such U sub-session exists E. For each i ∈ {1, . . . , qU } define Ej as the event that on the j-th U sub-session, in the order determined by the initialization calls from Z, it holds that (1) (c∗ 6= c ∧ pw0 6= pw) ∨ i∗ 6= S, (2) A does not query Fi∗ (pw0 ), (3) this is the first occurrence of pair (i∗ , pw0 ) on any U sub-session, and (4) AuthDecrw0 (c∗ ) 6= ⊥. Note that E is the union of events Ej for j = 1, . . . , qU . Since for (i∗ , pw0 ) in the j-th U sub- session A does not query Fi∗ (pw0 ), rw0 is not used anywhere else (in particular, Z learns c ← AuthEncrw (pu , Pu , Ps ), but rw is independent of rw0 ) and hence is a random string in {0, 1}2τ in Z’s view. Therefore, for each Ej we can construct a reduction RAUTH2,j to the authenticity of AE where rw0 in the j-th U sub-session is the challenge AE key: RAUTH2,j runs the code of G3 . In the j-th U sub-session, RAUTH2,j uses the decryption oracle to check if Ej occurs, and if so, it outputs c∗ (and breaks the game). We have that qU X DistG Z 3 ,G4 ≤ Pr[E] ≤ Pr[Ej ] ≤ qU · AdvAUTH,AE RAUTH2 , j=1 which is a negligible function of τ . Note that the combined conditions introduced in G3 and G4 are equivalent to the following: In the case that ¬(c∗ = c ∧ i∗ = S), U outputs (abort, sid , ssid ) if A does not query Fi∗ (pw0 ). Game G5 (extract A’s password guess on U interactions): In G4 , after U computes rw0 = Fi∗ (pw0 ) and receives (ssid , c∗ ) and (Prefix, ssid , prfx), it tests if AuthDecrw (c∗ ) can be parsed as (p∗u , Pu∗ , Ps∗ ), and either runs Πu on these decrypted AKE keys and ssid Π := [ssid ||prfx], or outputs (abort, sid , ssid ) if the parsing fails. Here we replace the above with the following (ssid Π does not change from G4 to G5 , so we omit it in the description of G5 below): 1. If c∗ = c ∧ i∗ = S, then do: (I) if pw0 = pw (which is case (∗)), then U runs Πu on inputs (pu , Pu , Ps ); (II) otherwise U outputs (abort, sid , ssid ). 2. If ¬(c∗ = c ∧ i∗ = S), and there are x1 6= x2 such that A queries both y1 = Fi∗ (x1 ) and y2 = Fi∗ (x2 ), and AuthDecy1 (c∗ ) 6= ⊥ and AuthDecy2 (c∗ ) 6= ⊥, output halt and abort the entire game. 3. If ¬(c∗ = c ∧ i∗ = S) and A queries rw0 = Fi∗ (x) for a unique x such that AuthDecrw0 (c∗ ) can be parsed as (p∗u , Pu∗ , Ps∗ ), then do: (I) if x = pw0 (which is case (∗∗)), then U runs Πu on inputs (p∗u , Pu∗ , Ps∗ ); (II) otherwise U outputs (abort, sid , ssid ). 38 4. Otherwise, i.e., ¬(c∗ = c ∧ i∗ = S) but A makes no Fi∗ (x) query as in case 2 or 3 above, U outputs (abort, sid , ssid ). We argue that this modification does not change Z’s view. First consider the case that c∗ = c ∧ i∗ = S. In G4 , if pw0 = pw, then U runs Πu on AuthDecrw0 (c∗ ) = AuthDecrw (c) = (pu , Pu , Ps ), which is replicated in case 1(I) of G5 ; otherwise U outputs (abort, sid , ssid ) by the condition introduced in G1 , which is replicated in case 1(II) of G5 . Now consider the case that ¬(c∗ = c ∧ i∗ = S). Then if case 2 occurs, i.e., A queries two distinct Fi∗ outputs which both decrypt c∗ , then G4 outputs halt by the condition introduced in G2 , which is the same as in G5 . If A makes no Fi∗ (pw0 ) query, then U outputs (abort, sid , ssid ) by the conditions introduced in G3 and G4 , which is replicated in cases 3(II) and 4 of G5 . The only remaining case is that A queries Fi∗ (pw0 ) and this is the unique query such that AuthDec0rw (c∗ ) can be parsed as (p∗u , Pu∗ , Ps∗ ), in which case in G4 U runs Πu on (p∗u , Pu∗ , Ps∗ ), which is replicated in case 3(I) in G5 . It follows that G4 ,G5 DistZ = 0. Comparison of G5 and the Simulated World. We argue that in G5 , when A sends (ssid , c∗ ) aimed at U and decides on the index i∗ for which U computes Fi∗ , the way that the game emulates U’s response to (c∗ , i∗ ) is the same as in the simulated world, except that G5 runs the AKE protocol Πu on behalf of U in case this user instance encounters either case (∗) or (∗∗), while the simulator SIM executes Πu only in case (∗∗), while in case (∗) the execution of Πu is replaced by a simulation by SIMAKE . Disregarding the differences due to Πu execution vs. Πu simulation, the simulation of U instances acts based on the following two cases: (i) If c∗ = c ∧ i∗ = S then SIM sends (TestAbort, sid , ssid , U) to F. • If F returns Succ, i.e., pw0 = pw11 then SIM proceeds to simulate Πu . We call this case (∗), and it corresponds to case 1(I) in G5 , while SIM handles it in step (1a) in Fig. 11. • If F returns Fail, i.e., pw0 6= pw, F sends (abort, sid , ssid ) to U (who outputs this message). This corresponds to case 1(II) in G5 , while SIM handles it in the same step as above. (ii) If ¬(c∗ = c ∧ i∗ = S), then for every x such that y = Fi∗ (x) was queried by A, SIM checks if AuthDecy (c∗ ) can be parsed as (p∗u , Pu∗ , Ps∗ ). • If there are two or more such x’s, i.e., x1 6= x2 s.t. A queries both y1 = Fi∗ (x1 ) and y2 = Fi∗ (x2 ), and AuthDecy1 (c∗ ) 6= ⊥ and AuthDecy2 (c∗ ) 6= ⊥, then SIM outputs halt and aborts. This corresponds to case 2 in G5 , and SIM handles it in in step (1(b)ii) in Fig. 11. 11 Note that our UC saPAKE functionality F does not check if S runs a session with a sub-session ID ssid matching the tested session of U. Indeed, our protocol allows the adversary to test if pw0 = pw without regards to the ssid on the U’s tested session. 39 • If there is a unique such x, then SIM sends (TestPwd, sid , ssid , U, x) to F. If F returns “correct guess,” i.e., x = pw0 , SIM runs Π on the decrypted values (p∗u , Pu∗ , Ps∗ ) ← AuthDecy (c∗ ). We call this case (∗∗), it corresponds to case 3(I) in G5 , and SIM handles it in step (1(b)iii). • If F returns “wrong guess,” i.e., x 6= pw0 , then SIM sends (TestAbort, sid , ssid , U) to F, and F sends (abort, sid , ssid ) to U (who outputs this message). This corresponds to case 3(II) in G5 , and SIM handles it in the same step as above. • If there is no such x, then SIM sends (TestPwd, sid , ssid , U, ⊥) and then (TestAbort, sid , ssid , U) to F, and F sends (abort, sid , ssid ) to U (who outputs this message). This corresponds to case 4 in G5 , and SIM handles it in step (1(b)i). We can see that if we omit the interaction between SIM and F above, and view SIM and F combined as the game challenger who interacts with Z and A, then the behavior of this game challenger when A sends (ssid , c∗ ) aimed at U is exactly the same with the behavior of G5 , except for Πu execution replaced by Πu simulation in case (∗). In the next four games (G5 – G9 ) we replace AKE credential generation and login protocol execution with the simulation by SIMAKE . Game G6 (outsource the generation of c and rw to SIMEQV ): At the beginning of the game, let SIMEQV simulate c and leave rw undefined, and let SIMEQV “open” rw when A computes it. Concretely, (1) At the beginning of the game, set c ← SIMEQV (kpr + 2kpb ); (2) When A queries FS (pw), set rw ← SIMEQV (pu , Pu , Ps ). Observe that in G5 , Z sees c ← AuthEncrw (pu , Pu , Ps ), and unless and until A queries FS (pw) (and thus learns rw), rw is not used by any party except in generating c, hence is a random string in {0, 1}2τ independent of everything else (except for c) in Z’s view. In particular, in G5 U does not evaluate F on any input, and all processing is based on whether c∗ = c ∧ i∗ = S, whether pw0 = pw, and on A’s queries to Fi∗ , which in the case i∗ = S are A’s queries to FS . Therefore, in G5 c followed by rw in case A queries FS (pw), are formed as in the “real game” in the encryption equivocability experiment for AE, where A sees the encryption c of (pu , Pu , Ps ) under key rw followed by the key rw (in case of A’s query to FS (pw)). On the other hand, in G6 ciphertext c followed by key rw are formed as in the “ideal game” in the encryption equivocability experiment for AE. Therefore, we can construct a reduction REQV to the equivocability of AE: REQV runs the code of G5 except that it uses its input as c and rw, and copies Z’s output. We have that G5 ,G6 EQV,AE DistZ ≤ AdvR EQV , which is a negligible function of τ . Game G7 (outsource the generation of pu , Pu , ps , Ps to SIMAKE ): G7 lets SIMAKE generate the two parties’ key pairs in the AKE protocol Π, pu , Pu , 40 ps , Ps , instead of generating them on its own. Concretely, at the beginning of the game, send (Compromise, sid , S) and (Compromise, sid , U) to SIMAKE and obtain pu , Pu , ps , Ps ; ignore all subsequent messages from SIMAKE . Clearly, an environment distinguishing between G6 and G7 can be turned into an environment ZAKE1 distinguishing between the real execution of Π and the simulation of Π by SIMAKE . Therefore, G6 ,G7 Π,{F ,SIMAKE } DistZ ≤ DistZAKE1AKE−KCI , Π,{F ,SIM } where DistZAKE1AKE−KCI AKE denotes the distinguishing advantage of ZAKE1 between the real execution of Π and the simulation of Π by SIMAKE , and is a negligible function of τ . Game G8 (leave pu , Pu , ps , Ps undefined until they are used): At the beginning of the game, do not send (Compromise, sid , S) or (Compromise, sid , U) to SIMAKE , and leave pu , Pu , ps , Ps undefined. However, (1) When A sends (StealPwdFile, sid ) to S, send (Compromise, sid , S) to SIMAKE to obtain (ps , Ps , Pu ); (2) When A queries FS (pw), send (Compromise, sid , U) to SIMAKE to obtain (pu , Pu , Ps ). Observe that in G7 , ps is not used unless and until A sends (StealPwdFile, sid ) to S (at which time the game challenger must send ps at part of its response file[sid ]); therefore, postponing generating ps to the time when A sends (StealPwdFile, sid ) to S does not change the game. Similarly, pu is not used unless and until A queries FS (pw) (at which time the game challenger must invoke SIMEQV (pu , Pu , Ps ) to generate rw as the response to A); therefore, postponing generating pu to the time when A queries FS (pw) does not change the game. We have that G7 ,G8 DistZ = 0. Comparison of G8 and the Simulated World. We argue that in G8 , pu , Pu , ps , Ps , rw and c are generated in the same way as in the simulated world. In the simulated world SIM sets c ← SIMEQV (kpr + 2kpb ) and pu , Pu , ps , Ps and rw are undefined until one of the two cases happen: Case 1: When the adversary compromises the server, i.e. when A sends (StealPwdFile, sid ) to S (step 1 of “Stealing Password Data and Offline Queries”), SIM sends (Compromise, sid , S) to SIMAKE to obtain (ps , Ps , Pu ). Case 2: When the adversary makes either a successful password test attack. This can happen in one of the following two ways. First, if A queries FS (pw) offline (step 2 of “Stealing Password Data and Offline Queries”; note that A can query FS (pw) offline only after compromising S), SIM sends (OfflineTestPwd, sid , pw) to F, which replies “correct guess” (because pw is correct). Second, if A queries FS (pw) online (step 4 of “OPRF Evaluation”), SIM checks if the FOPRF ticket counter tx is non-zero (recall that SIM emulates FOPRF ), and if so then SIM sends (TestPwd, sid , ssid 0 , pw) to F where ssid 0 is 41 a sub-session ID of some S session for which SIM holds record ssid 0 , act . Since password pw is correct F will reply “correct guess” if F holds a server session record ssid 0 , S, U, pw s.t. dPT(ssid ) = 1. In either case, given F’s response “correct guess”, SIM declares U compromised, sends (Compromise, sid , U) to SIMAKE , and on SIMAKE ’s response (pu , Pu , Ps ), SIM gets rw ← SIMEQV (pu , Pu , Ps ) and sets FS (pw) := rw. Observe that this interaction creates the same view to Z and A as G8 does, at least with regards to A’s view in case A evaluates FS (pw), assuming that in the online query case SIM never encounters the case that A queries FS (pw) online but either (1) tx = 0 or (2) tx > 0 but SIM holds no record ssid 0 , act or (3) SIM holds a record ssid 0 , act but F does not hold a record ssid 0 , S, U, pw s.t. dPT(ssid 0 ) = 1. Below we argue that this event cannot happen, and consequently the simulator SIM interacting with functionality F creates exactly the view that G8 does in the case A evaluates FS (pw). Note that SIM emulates FOPRF , and in particular it increments tx at each SndrComplete with prfx0 that does not match any U’s evaluation record hssid , U, x, prfxi, and it decrements it whenever tx > 0 and A sends (RcvComplete, sid , ssid , U, S) where hssid , U, x, prfxi is one of such unmatched U records, or A sends (RcvComplete, sid , ssid , A, S) corresponding to some A’s record hssid , A, x, prfxi. This is the same as FOPRF does, so tx in SIM’s emulation of FOPRF has always the same value as in FOPRF , and if FOPRF replies to A’s online query FS (x) then event (1) cannot happen in the simulation. Next, note that when SIM increments tx at some (SndrComplete, sid , ssid 0 , S) query, see step (2a) in Fig. 10, it marks this S session as actively attacked by recording ssid 0 , act , and the only way SIM can change this record to ssid 0 , used , see step (4(c)iv), is when it sends FS (x) and decrements tx. Therefore if tx > 0 then there must be some S sessions whose status is act, thus event (2) cannot happen in the simulation. Finally, note that when SIM records ssid 0 , act it sends (Interrupt, sid , ssid 0 , S) to F, see step (2a), at which point F sets dPT(ssid 0 ) := 1, and the only way F sets dPT(ssid 0 ) := 0 is if SIM sends (TestPwd, sid , ssid 0 , S, ·) to F. Since SIM sends such TestPwd query, in step (4(c)ii), only if it holds record ssid 0 , act , event (3) also cannot happen. Now the only difference between G8 and the simulated world lies in the simulation of Π. Game G9 (outsource to SIMAKE the simulation of all Πu instances of case (∗) and of all Πs instances): Replace each execution of Πs with their simulation by SIMAKE , and replace the executions of Πu with their simulation by SIMAKE for each all U sub-sessions which fall into case (∗), i.e., sub-sessions where c∗ = c ∧ i∗ = S ∧ pw0 = pw. Specifically, modify the game in case (*) as follows: 1. When U’s OPRF sub-session is completed and A sends c∗ (= c) to U, send (UsrSession, sid , ssid Π , U, S) to SIMAKE for ssid Π = [ssid ||prfx] where prfx 42 was determined by A in the Eval handling of this U sub-session, and record hssid Π , ssid , U, Si marked fresh; 2. When Z inputs (SvrSession, sid , ssid ) to S, send (SvrSession, sid , ssid Π , U, S) to SIMAKE for ssid Π = [ssid ||prfx] where prfx was determined by A in the SndrComplete handling of this S sub-session, and record hssid Π , ssid , S, Ui marked fresh; 3. On (Impersonate, sid , ssid Π , P) from SIMAKE , if there is a record hssid Π , ssid , P0 , Pi marked fresh and G9 sent (Compromise, sid , P) to SIMAKE before, mark this record compromised; 4. While SIMAKE simulates Π instances, pass messages between SIMAKE and A; 5. On (NewKey, sid , ssid Π , P, SK ∗ ) from SIMAKE , if there is a record hssid Π , ssid , P, P0 i not marked completed, do: – If the record is compromised, or P or P0 is corrupted, set SK := SK ∗ . – Else if the record is marked fresh, a (sid , ssid , SK 0 ) tuple was sent to P0 while record hssid Π , ssid , P0 , Pi was fresh, set SK := SK 0 . – Else pick SK ←R {0, 1}τ . Finally, mark hssid Π , ssid , P, P0 i completed and send (sid , ssid , SK) to P. Clearly, an environment distinguishing between G8 and G9 can be turned into an environment ZAKE2 distinguishing between the real execution of Π and the simulation of Π by SIMAKE . Therefore, G8 ,G9 Π,{F ,SIMAKE } DistZ ≤ DistZAKE2AKE−KCI , which is a negligible function of τ . Comparison of G9 and the Simulated World. We argue that G9 is identical to the simulated world, i.e., to the ideal world interaction where the game challenger is split into the simulator SIM and the SaPAKE functionality F. Note that G9 decides in the same way as G5 whether a U sub-session results in U outputting (abort, sid , ssid ) or falls into cases (∗) and (∗∗), and it generates the AKE keys pu , Pu , ps , Ps and the AE ciphertext c in the same way as in G8 , and we argued above that G5 and G8 execute these parts in the same way as SIM interacting with F. The remaining part is to argue that G9 also emulates SIM interacting with F with respect to the session keys SK output by U and S.12 The case that either U or S is corrupted are easiest to see because in that case F passes the key received by SIM to the corresponding party, thus below we assume that neither U nor S is corrupted. Consider first U’s output SK in case (∗∗). In G9 , SK is determined by the output of protocol Πu executed on behalf of U on inputs (p∗u , Pu∗ , Ps∗ ), which are in turn determined by (c∗ , i∗ ) and A’s queries to Fi∗ , as described in G5 . In the simulated world, as we argued in G5 , SIM runs Πu on the same inputs, hence SK computed by SIM is identically distributed. At the end of Πu , SIM sends 12 Recall that S’s output is always of the form (sid , ssid , SK), while U’s output is (sid , ssid , SK) in cases (∗) and (∗∗), and (abort, sid , ssid ) otherwise; but we argued that these cases are handled in the simulated world in the same way as in game G5 . 43 (NewKey, sid , ssid , U, SK) to F, who will pass SK in message (sid , ssid , SK) to U because case (∗∗) happens only if SIM sends (TestPwd, sid , ssid , U, x) to F (see step (1(b)iii) in Fig. 11) and F replies “correct guess,” at which point F marked this SaPAKE-layer sub-session record hssid , S, U, pwi compromised. Secondly, consider all Π executions which are outsourced to SIMAKE in G9 , i.e., instances of Πu which fall into case (∗) and all instances of Πs . In G9 , SK output by party P ∈ {U, S} is determined by (1) the status of record hssid Π , ssid , P, P0 i kept for this sub-session, (2) SIMAKE ’s message (NewKey, sid , ssid Π , P, SK ∗ ), and (3) whether (sid , ssid , SK 0 ) was sent to P0 at the time there was a fresh record hssid Π , ssid , P0 , Pi. Game G9 uses the same factors to decide on P’s output by emulating functionality FAKE−KCI . In the simulated world, SK determined by SIM is identically distributed, because SIM also emulates FAKE−KCI and uses the same rules to determine the status of each AKE-layer session, hence factors (1)-(3) play exactly the same role in the simulated world. However, similarly to case (∗∗) discussed above, SIM does not output message (sid , ssid , SK) directly to P, but sends (NewKey, sid , ssid , P, SK) to F, who then “post-processes” these keys, using its own records for these sub-sessions, resp. hssid , P, P0 , pw◦ i and hssid , P0 , P, pw◦◦ i. We argue that this post-processing by F always implements the same logic for determining SK on a given sub-session as SIM does. Specifically, we argue that the following three invariants hold: 1. If SIM passes SIMAKE ’s key SK ∗ to F, i.e., if the AKE-layer sub-session record hssid Π , ssid , P, P0 i is compromised, then the SaPAKE-layer sub-session record hssid , P, P0 , pw◦ i is either compromised, or, if P = S, it is interrupted but flag = compromised. 2. If there are two AKE-layer sub-session records with matching AKE-layer sub-session ID’s, i.e., hssid Π , ssid , P, P0 i and ssid 0Π , ssid 0 , P0 , P such that ssid Π = ssid 0Π , then it holds that (a) their SaPAKE-layer sub-session ID’s match as well, i.e., ssid = ssid 0 , and (b) the passwords in the corresponding SaPAKE-layer sessions also match, i.e., F records for these sub-sessions, hssid , P, P0 , pw◦ i and ssid 0 , P0 , P, pw◦◦ , satisfy pw◦ = pw◦◦ . 3. If two AKE-layer sub-sessions hssid Π , ssid , P, P0 i and ssid 0Π , ssid 0 , P0 , P are “connected” by the FAKE−KCI emulated by G9 (and by SIM), in the sense that step 5 in FAKE−KCI emulation in G9 (and step 2 in Fig. 11) output the same key SK to both sessions, then the corresponding SaPAKE-layer sub-sessions hssid , P, P0 , pw◦ i and hssid , P, P0 , pw◦ i are fresh. Invariant (1) implies that if AKE-layer sub-session record hssid Π , ssid , P, P0 i is compromised then F will pass SK ∗ output by SIMAKE to P, hence key SK output by P in the simulated world is the same as in G9 . Invariants (2) and (3) together imply that if AKE-layer sub-session records hssid Π , ssid , P, P0 i and ssid Π , ssid 0 , P0 , P (assume w.l.o.g. that the latter sub-session completes first) output the same key SK, chosen at random either by G9 or by SIM in the FAKE−KCI emulation, then F will replicate this behavior: First, when hssid , P0 , P, pw◦◦ i completes, F picks a random key SK 0 ← {0, 1}τ as SK because, by invariant (3) this SaPAKE-layer sub-sessions 44 is fresh. Second, when hssid , P, P0 , pw◦ i completes, F will assign to it the same key SK 0 because, by invariant (3) that session will also be fresh, and by invariant (2) since these two sub-sessions have the same AKE-layer ID’s ssid Π (otherwise they wouldn’t be connected on the AKE-layer), then their SaPAKE-layer ID’s match too, i.e., ssid = ssid 0 , and so do their passwords, i.e., pw◦ = pw◦◦ . In all other cases both G9 and SIM pick a random key SK for session hssid Π , ssid , P, P0 i, therefore in the simulated world regardless if F passes that key to P, or it replaces it with a new choice SK 0 of a random key, party P outputs (sid , ssid , SK 0 ) for a random key SK 0 , which matches the distribution of its outputs in G9 We argue that the three invariants indeed hold. We start from invariant (1). Note that a fresh AKE-layer session hssid Π , ssid , P, P0 i turns compromised if SIMAKE sends (Impersonate, sid , ssid Π , P0 ) and P0 is declared compromised. Consider case P0 = S first. S is declared compromised by SIM (and G9 ) only if A sends (StealPwdFile, sid ), see step 1 in Fig. 9, in which case F marks the password file compromised. If SIMAKE then sends (Impersonate, sid , ssid Π , S) then SIM sends (Impersonate, sid , ssid ) to F, at which point F marks the SaPAKE-layer session hssid , U, S, pw0 i as compromised if this SaPAKE-layer session is fresh. However, note that U session in case (∗) does not start unless F replies Succ to SIM’s query (TestAbort, sid , ssid , U), see step (1a) in Fig. 11, which means that the SaPAKE-layer U session remained fresh after SIM’s TestAbort query, and hence it became compromised after SIM’s Impersonate query. Consider now the case when P0 = U. If U is declared compromised then A queried FS (pw), either offline or online, so SIM holds a record hfile, U, S, pwi, hence if SIMAKE sends (Impersonate, sid , ssid Π , U) then SIM sends (TestPwd, sid , ssid , S, pw) to F, and since the tested password is correct, F will process the SaPAKE-layer session hssid , S, U, pwi as follows: If this SaPAKE-layer session was fresh then it will become compromised, and if it was interrupted then it will remain interrupted. However, note that if A queries FS (pw) either offline or online, this means that either some OfflineTestPwd query to F in step 1, Fig. 9, or some TestPwd query to F in step 4c, Fig. 10, received F’s response “correct guess”, but at this point F sets flag := compromised. This means that SaPAKE-layer S session is either compromised or it is interrupted but flag = compromised, as claimed. As for invariant (2), part (a) is immediate because ssid Π is formed as [ssid ||prfx] on each session, so equality of ssid Π ’s implies equality of ssid ’s. As for part (b), note that U session in case (∗) runs only if TestAbort does not make it abort, see step 1a, Fig. 11, which means that pw0 = pw. We turn to invariant (3). Note that FAKE−KCI emulation, by either G9 or SIM, connects these two AKE-layer session only if their AKE-layer ssid’s match, i.e. ssid Π = ssid 0Π . Note also that ssid Π == [ssid ||prfx] and ssid Π == [ssid 0 ||prfx0 ], which implies that ssid = ssid 0 and that the OPRF transcript prefixes of this U and S sessions matched as well, i.e. prfx = prfx0 . Note that U’s AKE-layer session hssid Π , ssid , U, Si starts fresh in case (∗), and if that session remains fresh 45 until NewKey message for it (as must be the case for FAKE−KCI to “connect” it to the S session), then SIM does not send to F any queries which would change the status of the SaPAKE-layer session, i.e. the corresponding SaPAKE-layer session stays fresh. Regarding S’s AKE-layer session ssid 0Π , ssid 0 , S, U , note that when this session starts, in step 2a in Fig. 10, if prfx = prfx0 then SIM does not write record hssid , acti. Consequently SIM does not send Interrupt for that session to F, and also SIM will never choose that session in step 4(c)i, Fig. 10, and hence will not send TestPwd for that session to F in step 4(c)ii. (These are all consequences of the fact that if OPRF transcript prefixes match then SIM cannot, and does not, use this S session to evaluate S’s random function FS .) Consequently, the corresponding SaPAKE-layer session stays fresh as well, as we claimed. Summing up all results above, we conclude that Z’s distinguishing advantage between the real world and the simulated world is a negligible function of the security parameter τ , which completes the proof. 6 OPAQUE: A Strong Asymmetric PAKE Instantiation Fig. 12 shows OPAQUE, a concrete instantiation of the generic OPRF+AKE protocol from Fig. 8. The OPRF is instantiated with the DH-OPRF scheme from [31] recalled in Appendix B, while the AKE protocol can be instantiated with any AKE protocol that realizes the FAKE−KCI functionality from Fig. 7. In Fig. 12 this is illustrated with HMQV [38]. Note that the two messages of DH- OPRF and the first two messages from HMQV run “in parallel” hence the OPRF does not add to the total number of messages exchanged in the protocol. In this case, the number of messages is three as needed for HMQV to instantiate the FAKE−KCI functionality (see Section 5.1). The third component of the protocol is a random-key robust authenticated encryption scheme AE which can accommodate multiple instantiations as discussed below. By Theorem 2 on the security of the generic OPRF+AKE construction, by Lemma 1 in Appendix B on the security of DH-OPRF, the security of HMQV as AKE-KCI, and assuming a random-key robust and equivocable instantiation of AE, we get that protocol OPAQUE realizes functionality FsaPAKE . Hence it is a provably-secure Strong aPAKE protocol, under the One-More Diffie-Hellman assumption [5, 31] in the random oracle model. A similar result can be argued on the basis of the SIGMA protocol from [37] – see Section 5.1. 6.1 Protocol Details and Properties We expand on the specification of OPAQUE and the protocol’s properties. • Password registration. Password registration is the only part of the protocol assumed to run over secure channels where parties can authenticate each other. We note that while OPAQUE is presented with S doing all the registration operations, in practice one may want to avoid that. Instead, we can let S choose an OPRF key ks and U choose pw, and then run the OPRF protocol between 46 Public Parameters and Components – Security parameter τ – Group G of prime order q, |q| = 2τ and generator g (G∗ denotes G \ {1}). – Hash functions H(·, ·), H 0 (·) with ranges {0, 1}2τ and G, respectively. – Pseudorandom function (PRF) f (·) with range {0, 1}2τ . – OPRF function defined as Fk (x) = H(x, (H 0 (x))k ) for key k ∈ Zq . – Random-key robust authenticated encryption scheme (AuthEnc, AuthDec). – Key exchange formula KE defined below. Password Registration 1. (StorePwdFile, sid , U, pw): S computes ks ←R Zq , rw := Fks (pw), ps ←R Zq , pu ←R Zq , Ps := g ps , Pu := g pu , c ← AuthEncrw (pu , Pu , Ps ); it records file[sid ] := hks , ps , Ps , Pu , ci. Login 1. (UsrSession, sid , ssid , S, pw): U picks r, xu ←R Zq ; sets α := (H 0 (pw))r and Xu := g xu ; sends α and Xu to S. 2. (SvrSession, sid , ssid ): On input α from U, S proceeds as follows: (a) Checks that α ∈ G∗ . If not, outputs (abort, sid , ssid ) and halts; (b) Retrieves file[sid ] = hks , ps , Ps , Pu , ci; (c) Picks xs ←R Zq and computes β := αks and Xs := g xs ; (d) Computes K := KE(ps , xs , Pu , Xu ) and sets: ssid 0 := H(sid , ssid , α), SK := fK (0, ssid0 ), As = fK (1, ssid0 ); (e) Sends β, Xs , c and As to U; 3. On input β, Xs , c and As from S, U proceeds as follows: (a) Checks that β ∈ G∗ . If not, outputs (abort, sid , ssid ) and halts; (b) Computes rw := H(pw, β 1/r ); (c) Computes AuthDecrw (c). If the result is ⊥, outputs (abort, sid , ssid ) and halts. Otherwise sets (pu , Pu , Ps ) := AuthDecrw (c); (d) Computes K := KE(pu , xu , Ps , Xs ) and sets: ssid 0 := H(sid , ssid , α), SK := fK (0, ssid0 ), As = fK (1, ssid0 ), Au = fK (2, ssid0 ); (e) Verifies that As is same as received from S. If not, it outputs (abort, sid , ssid ) and halts. (f) Sends Au to S and outputs (sid , ssid , SK). 4. On input Au from U, S verifies that Au = fK (2, ssid0 ). If not, it outputs (abort, sid , ssid ) and halts; else it outputs (sid , ssid , SK). / G∗ Key exchange formula KE with HMQV instantiation (if any of Xu , Pu , Xs , Ps ∈ the receiving party outputs (abort, sid , ssid ) and halts) For S: KE(ps , xs , Pu , Xu ) = H (Xu Pueu )xs +es ps  For U: KE(pu , xu , PS , XS ) = H (Xs Pses )xu +eu pu  where eu = H(Xu , S, ssid 0 ) mod q, es = H(Xs , U, ssid 0 ) mod q. Fig. 12: Protocol OPAQUE with DH-OPRF and HMQV 47 U and S so only U learns its secrets (pw, rw, pu ). The server chooses its pair (ps , Ps ) and provides Ps to U who builds the ciphertext c and sends to S. In this way the server never sees the user’s password, a major benefit, for example to avoid accidental storage of plaintext passwords that has affected also security- conscious companies [2, 3]. Note that this prevents S from checking password rules, an operation that can be moved to the client side (restricted server-side checks such as preventing the repeat of a recent password can be implemented). • Authenticated encryption. As specified in Section 5.2, the authenticated encryption scheme AE used in the protocol needs to satisfy the random-key robustness property defined there. In practice, using an encrypt-then-mac scheme with HMAC-256 (or larger) as the MAC provides this property (in particular, any AE scheme can be made robust by computing HMAC on top of the ciphertext output by AuthEnc). We note that the standard GCM mode for authenticated encryption is not random-key robust but it can be adapted to achieve this property. • Key exchange and forward secrecy. The generic AKE representation in Fig. 12 via the KE formula is done for simplicity and since it applies to HMQV and, more generally, to protocols that follow the implicit authentication approach. Other protocols may require additional operations, such as signatures in the case of the SIGMA as mentioned above. It follows from our analysis that any KE protocol used with OPAQUE must resist KCI attacks and enjoy full forward secrecy (against active attacks). The latter condition implies that OPAQUE must have at least three messages13 . • User iterated hashing. OPAQUE can be strengthened by increasing the cost of a dictionary attack in case of server compromise. This is done by changing the computation of rw to rw = H n (Fk (pw)), that is, the client applies n iterations of the function H on top of the result of the OPRF value Fk (pw). In practice, the iterations H n would be replaced with one of the standard password-based key derivation functions, such as PBKDF2 [36] or bcrypt [46], or by more modern memory-hard functions such as Argon2 [10] or Scrypt [44]. This forces an attacker that compromises the password file at the server to compute for each candidate password pw0 the function Fk (pw0 ) as well as the additional n hash iterations. Note that n needs not be remembered by the user; it can be sent from S to U in the server’s message. Furthermore, one can follow Boyen’s design and apply the probabilistic Halting KDF function [11] as used in [12] so that the iterations count is hidden from the attacker and even from the server. An additional benefit of client-side hardening is that not only it slows down offline attacks upon server compromise but also online password-guessing attacks. On the other hand, clients running on weak machines are limited in the amount of hardening they can apply. 13 To achieve full forward secrecy one of the client messages must depend on the user’s private key [38]. So at the minimum one needs a first message from the client with user account information, followed by a message from the server with the user’s envelope, and a third from the client that depends on the user’s private key. 48 • Performance. OPAQUE takes three messages, one exponentiation for S, two and a hashing-into-G for U, plus the cost of KE. With HMQV, the latter cost is one offline fixed-base exponentiation and one multi-exponentiation (at the cost of 1.17 regular exponentiations) per party (about three exponentiations in total for the server and four for the user). All exponentiations are in regular DH groups, hence accommodating the fastest elliptic curves (e.g., no pairings). It is common in PAKE protocols to count number of group elements transmitted between the parties. In OPAQUE, U sends two while S sends three (one, Pu , can be omitted at the cost of one fixed-based exponentiation at the client). See also Section 6.3. • Performance comparison. The introduction presents background on OPAQUE and other password protocols. Here we provide a comparison with the more efficient among these protocols, particularly those that are being, or have been, considered for standardization. Clearly, OPAQUE is superior security-wise as the only one not subject to pre-computation attacks, but it also fares well in terms of performance. AugPAKE [49, 50], is computationally very efficient with only 2.17 exponentiations per party; however, it uses 4 messages and does not provide forward secrecy. In addition, the protocol has only been analyzed as a PAKE protocol, not aPAKE [50]. Another proposed aPAKE protocol, SPAKE2+ [4, 19], uses two messages only and 3 multi-exponentiations (or about 3.5 exponentiations) per party which is similar to OPAQUE cost. The security of the protocol has only been informally argued in [19] and to the best of our knowledge no formal analysis has appeared. We also mention SRP which has been included in TLS ciphersuites in the past but is considered outdated as it does not have an instantiation that works over elliptic curves (the protocol is defined over rings and uses both addition and multiplication). Its implementations over RSA moduli is therefore less efficient than those over elliptic curve; it also takes 4 messages. Recently, a few protocols have been presented with proofs of aPAKE security but, as the rest, they are vulnerable to pre-computation. The protocol VTBPEKE in [45] uses 3 messages and 4 exponentiations per party and was proven secure in the non-UC aPAKE model of [9], while AuCPace [28] requires 4 messages and 4 (resp. 3) exponentiations for the server (resp. client) and is proven in the UC aPAKE model of [26]. Also proven in this model is [34], a simultaneous one-round scheme that works over bilinear groups and requires 4 exponentiations and 3 pairing per party. We note that the above protocols require an initial message from server to user in order to transmit salt, which may result in one or two added messages to the above message counts (except for VTBPEKE which already includes the salt transmission in its 3 messages). All these protocols, like OPAQUE, work in the RO model. • Threshold implementation. We comment on a simple extension of OPAQUE that can be very valuable in large deployments, namely, the ability to implement the OPRF phase as a Threshold OPRF [32]. In this case, an attacker needs to break into a threshold of servers to be able to impersonate the servers to the user or to run an offline dictionary attack. Such an implementation requires no user- 49 side changes, i.e., the user does not need to know if the system is implemented with one or multiple servers. • OPAQUE as a general secret retrieval mechanism. An important feature of OPAQUE is that it can serve not only as an aPAKE protocol but more generally as a means for retrieving a secret or credential from a server (such a secret is protected under ciphertext c stored at the server). In this functionality, OPAQUE acts as a 1-out-of-1 implementation of the PPSS scheme from [32]. The retrieved secret can be used to protect information such as a bitcoin wallet, serve as a user- controlled encryption key for a backup or other information repository (e.g., a password manager), used as an authentication or signing key, and more. This offers a far more secure alternative to the practice of deriving low-entropy secrets directly from a user’s password. 6.2 OPAQUE and TLS: Client authentication and hedging against PKI failures As discussed earlier, OPAQUE offers a much more secure alternative to password-authenticated key exchange than the current practice of transmitting passwords over TLS. Yet, OPAQUE (as any other aPAKE) still requires additional mechanisms for negotiating cryptographic parameters (such as crypto algorithms) and for establishing the means needed to encrypt and authenticate communications using the keys generated by OPAQUE. Thus, it is natural to compose OPAQUE with the TLS protocol to offer strong password security while leveraging the standardized negotiation and record-layer security of TLS. Moreover, TLS can offer an initial server-authenticated channel to protect the privacy of account information, such as user name, transmitted between client and server. Here we discuss possible schemes for composing OPAQUE and TLS. We consider TLS 1.3 [47] as the upcoming and more secure version of TLS although some of the mechanisms can be implemented via prior versions of TLS. The simplest TLS-OPAQUE combination is one where U’s private key pU stored by OPAQUE at S is used as a signature key for TLS client authentication. In this case, the OPAQUE-extended handshake protocol includes the following sequential steps (for a total of 5 messages): (i) a 1-RTT run of TLS 1.3 handshake protocol that produces a session key authenticated by S’s TLS certificate; (ii) the first two OPAQUE messages exchanged between client and server excluding the KE values g x , g y (these were already exchanged as part of the TLS 1-RTT run); (iii) TLS 1.3 client authentication using U’s private signature key pU retrieved from S in step (ii). These steps result in mutual authentication where server’s authentication is accomplished based on a TLS certificate. The client can either trust such a certificate or it can verify equality of the certificate’s public key against PS as retrieved by OPAQUE. In case of a mismatch the client can request a signature 50 of S using PS which is computed on the TLS transcript14 . In the latter case, the protocol does not rely on PKI certificates except for protecting account information. In all cases, the security of passwords and password authentication does not rely on PKI but on OPAQUE only. Variants of the above scheme include the use of a TLS 1.3 0-RTT exchange for sending the first OPAQUE message (including protected account information) in which case steps (i) and (ii) are executed concurrently for a total of three messages (flights in TLS jargon) as in regular TLS. This variant, while more efficient, relies on 0-RTT which is available only to clients and servers that have previously shared a key (negotiated in a previous handshake). A 0-RTT variant independent of pre-shared keys and based instead on a server’s public key is possible (e.g., [40]) but it is not standardized by TLS 1.3. Finally, if protecting the secrecy of user’s account information is not considered necessary then steps (i) and (ii) can run concurrently (without using the 0-RTT scheme); in this case server’s authentication is based on OPAQUE’s server key PS . This setting also allows for a more efficient scheme using HMQV as illustrated in Fig. 12 (with additional key derivation and record layer processing based on TLS). We note that the security of the above variants and composition rely on the modularity of OPAQUE that can compose the OPRF steps with arbitrary key-exchange protocols (with KCI and forward security). We remark that the security of TLS 1.3 has been analyzed in multiple works (cf. [20–22, 24, 35, 41]) with client authentication via exported authentication (or “post-handshake authentication”) studied in [39]. 6.3 An OPAQUE variant: Multiplicative blinding A variant of OPAQUE is obtained by replacing the user’s exponential blinding operation α := H 0 (pw)r in DH-OPRF with α := H 0 (pw)·g r . The server responds as before with β = αks . Assuming that U knows the value y = g ks (previously stored or received from S), it can compute the same “hashed Diffie-Hellman” value H 0 (pw)ks as β/y r . The advantage of this variant is that while the number of client exponentiations remains the same, one is fixed-base (g r ) and the other (y r ) can also be fixed-base if U caches y, a realistic possibility for accounts where the user logs in frequently (e.g., a personal email or social network). Computing y r can also be done while waiting for the server’s response to reduce latency. Moreover, both exponentiations can be done offline although only short-term storage is recommended as the leakage of r exposes H 0 (pw) (hence opens pw to a dictionary attack). If U does not store y, it needs to be transmitted to U by S together with the response β. This still allows for fixed-base optimization for computing g r but not for y r . Note that the two OPAQUE variants (with exponential or multiplicative blinding) compute the same value rw, hence an implementation can support both and leave it up to the client to choose one (and request y from S if needed). 14 Such additional server authentication and the client authentication in step (iii) can be implemented using TLS exported authenticators as defined in [51] (client authentication in this case corresponds to post-handshake authentication in [47]). 51 However, it turns out that this multiplicative mechanism results in an OPRF protocol that does not realize our OPRF functionality FOPRF . Thus, our analysis here does not imply the security of the multiplicative OPAQUE variant in general. If rw is redefined as rw := H(pw, y, H 0 (pw)ks ), i.e., if y is included under the hash, then the resulting OPRF does realize our functionality, and OPAQUE remains secure as SaPAKE under both blinding variants. This change, however, introduces a (slight) overhead of having to transmit y even if the client implements the exponential blinding operation. An alternative approach would be to replace the OPRF functionality FOPRF with a 0 0 weaker form FOPRF and to show that (i) FOPRF is realized by the multiplicative 0 variant (even without hashing y) and (ii) FOPRF is sufficient for proving Theorem 2 hence implying the security of OPAQUE as SaPAKE. We intend to investigate this weakening of FOPRF . Acknowledgments. We thank Julia Hesse and Björn Tackmann for their invaluable comments that helped improve our presentation and formal treatment, and to Clemens Hlauschek, Kevin Lewi and Payman Mohassel for helpful discussions. References 1. CFRG, Crypto Forum Research Group, https://datatracker.ietf.org/rg/ cfrg/documents/. 2. Facebook stored hundreds of millions of passwords in plain text, https://www.theverge.com/2019/3/21/18275837/ facebook-plain-text-password-storage-hundreds-millions-users. 3. Google stored some passwords in plain text for fourteen years, https://www.theverge.com/2019/5/21/18634842/ google-passwords-plain-text-g-suite-fourteen-years. 4. M. Abdalla and D. Pointcheval. Simple password-based encrypted key exchange protocols. In Topics in Cryptology – CT-RSA 2005, pages 191–208. Springer, 2005. 5. M. Bellare, C. Namprempre, D. Pointcheval, and M. Semanko. The one-more-RSA- inversion problems and the security of Chaum’s blind signature scheme. Journal of Cryptology, 16(3), 2003. 6. M. Bellare, D. Pointcheval, and P. Rogaway. Authenticated key exchange secure against dictionary attacks. In Advances in Cryptology – EUROCRYPT 2000, pages 139–155. Springer, 2000. 7. S. M. Bellovin and M. Merritt. Encrypted key exchange: Password-based protocols secure against dictionary attacks. In IEEE Computer Society Symposium on Research in Security and Privacy – S&P 1992, pages 72–84. IEEE, 1992. 8. S. M. Bellovin and M. Merritt. Augmented encrypted key exchange: a password- based protocol secure against dictionary attacks and password file compromise. In ACM Conference on Computer and Communications Security – CCS 1993, pages 244–250. ACM, 1993. 9. F. Benhamouda and D. Pointcheval. Verifier-based password-authenticated key exchange: New models and constructions. IACR Cryptology ePrint Archive, 2013:833, 2013. 52 10. A. Biryukov, D. Dinu, and D. Khovratovich. Argon2: new generation of memory- hard functions for password and other applications. In IEEE European Symposium on Security and Privacy – EuroS&P 2016, pages 292–302. IEEE, 2016. 11. X. Boyen. Halting password puzzles. In Usenix Security Symposium – SECURITY 2007, pages 119–134. The USENIX Association, 2007. 12. X. Boyen. HPAKE: Password authentication secure against cross-site user impersonation. In Cryptology and Network Security – CANS 2009, pages 279– 298. Springer, 2009. 13. V. Boyko, P. MacKenzie, and S. Patel. Provably secure password-authenticated key exchange using Diffie-Hellman. In Advances in Cryptology – EUROCRYPT 2000, pages 156–171. Springer, 2000. 14. R. Canetti. Universally composable security: A new paradigm for cryptographic protocols. In IEEE Symposium on Foundations of Computer Science – FOCS 2001, pages 136–145. IEEE, 2001. 15. R. Canetti, S. Halevi, J. Katz, Y. Lindell, and P. MacKenzie. Universally composable password-based key exchange. In Advances in Cryptology – EUROCRYPT 2005, pages 404–421. Springer, 2005. 16. R. Canetti and H. Krawczyk. Analysis of key-exchange protocols and their use for building secure channels. In Advances in Cryptology – EUROCRYPT 2001, pages 453–474. Springer, 2001. 17. R. Canetti and H. Krawczyk. Security analysis of IKE’s signature-based key- exchange protocol. In Advances in Cryptology – CRYPTO 2002, pages 143–161. Springer, 2002. Also Cryptology ePrint Archive, Report 2002/120. 18. R. Canetti and H. Krawczyk. Universally composable notions of key exchange and secure channels. In Advances in Cryptology – EUROCRYPT 2002, pages 337–351. Springer, 2002. 19. D. Cash, E. Kiltz, and V. Shoup. The twin Diffie-Hellman problem and applications. In Advances in Cryptology – EUROCRYPT 2008, pages 127–145. Springer, 2008. 20. C. Cremers, M. Horvat, J. Hoyland, S. Scott, and T. van der Merwe. A comprehensive symbolic analysis of TLS 1.3. In ACM CCS 17, 2017. 21. C. Cremers, M. Horvat, S. Scott, and T. van der Merwe. Automated verification of TLS 1.3: 0-RTT, resumption and delayed authentication. In IEEE S&P 2016., 2016. 22. B. Dowling, M. Fischlin, F. Günther, and D. Stebila. A cryptographic analysis of the TLS 1.3 handshake protocol candidates. In ACM CCS, 2015. Also, Cryptology ePrint Archive, Report 2015/914. 23. P. Farshim, C. Orlandi, and R. Rosie. Security of symmetric primitives under incorrect usage of keys. IACR Transactions on Symmetric Cryptology, 2017(1):449– 473, 2017. 24. M. Fischlin and F. Günther. Replay attacks on zero round-trip time: The case of the TLS 1.3 handshake candidates. In IEEE S&P 2017, 2017. 25. M. J. Freedman, Y. Ishai, B. Pinkas, and O. Reingold. Keyword search and oblivious pseudorandom functions. In Theory of Cryptography – TCC 2005, pages 303–324. Springer, 2005. 26. C. Gentry, P. MacKenzie, and Z. Ramzan. A method for making password-based key exchange resilient to server compromise. In Advances in Cryptology – CRYPTO 2006, pages 142–159. Springer, 2006. 27. L. Gong, M. A. Lomas, R. M. Needham, and J. H. Saltzer. Protecting poorly chosen secrets from guessing attacks. IEEE Journal on Selected Areas in Communications, 11(5):648–656, 1993. 53 28. B. Haase and B. Labrique. AuCPace: Efficient verifier-based PAKE protocol tailored for the IIoT. In CHES, 2019. Cryptology ePrint Archive, Report 2018/286. 29. S. Halevi and H. Krawczyk. Public-key cryptography and password protocols. ACM Transactions on Information and System Security (TISSEC), 2(3):230–268, 1999. 30. S. Jarecki, A. Kiayias, and H. Krawczyk. Round-optimal password-protected secret sharing and T-PAKE in the password-only model. In Advances in Cryptology – ASIACRYPT 2014, pages 233–253. Springer, 2014. 31. S. Jarecki, A. Kiayias, H. Krawczyk, and J. Xu. Highly-efficient and composable password-protected secret sharing (or: how to protect your bitcoin wallet online). In IEEE European Symposium on Security and Privacy – EuroS&P 2016, pages 276–291. IEEE, 2016. 32. S. Jarecki, A. Kiayias, H. Krawczyk, and J. Xu. TOPPSS: Cost-minimal password- protected secret sharing based on threshold OPRF. In Applied Cryptology and Network Security – ACNS 2017, pages 39–58. Springer, 2017. 33. S. Jarecki, H. Krawczyk, and J. Xu. OPAQUE: an asymmetric PAKE protocol secure against pre-computation attacks. In Advances in Cryptology - EUROCRYPT 2018, pages 456–486. Springer, 2018. 34. C. S. Jutla and A. Roy. Smooth NIZK arguments. In Theory of Cryptography – TCC 2018, pages 235–262. Springer, 2018. 35. B. B. K. Bhargavan and N. Kobeissi. Verified models and reference implementations for the TLS 1.3 standard candidate. In IEEE S&P 2017, 2017. 36. B. Kaliski. PKCS# 5: Password-based cryptography specification version 2.0. 2000. 37. H. Krawczyk. SIGMA: The “SIGn-and-MAc” approach to authenticated Diffie- Hellman and its use in the IKE protocols. In CRYPTO, pages 400–425, 2003. 38. H. Krawczyk. HMQV: A high-performance secure Diffie-Hellman protocol (extended abstract). In Advances in Cryptology – CRYPTO 2005, page 546. Springer, 2005. 39. H. Krawczyk. Unilateral-to-mutual authentication compiler for key exchange (with applications to client authentication in TLS 1.3). In ACM CCS 2016, 2016. 40. H. Krawczyk and H. Wee. The OPTLS protocol and TLS 1.3. In EuroS&P, 2016. 41. X. Li, J. Xu, Z. Zhang, D. Feng, , and H. Hu. Multiple handshakes security of TLS 1.3 candidates. In IEEE S&P 2016., 2016. 42. P. Mackenzie. More efficient password-authenticated key exchange. In Topics in Cryptology – CT-RSA 2001, pages 361–377. Springer, 2001. 43. P. MacKenzie, S. Patel, and R. Swaminathan. Password-authenticated key exchange based on RSA. In Advances in Cryptology – ASIACRYPT 2000, pages 599–613. Springer, 2000. 44. C. Percival. Stronger key derivation via sequential memory-hard functions, http: //www.tarsnap.com/scrypt/scrypt.pdf,. 45. D. Pointcheval and G. Wang. VTB-peke: Verifier-based two-basis password exponential key exchange. In ACM Asia Conference on Computer and Communications Security – AsiaCCS 2017, pages 301–312. ACM, 2017. 46. N. Provos and D. Mazieres. A future-adaptable password scheme. In USENIX Annual Technical Conference, FREENIX Track, pages 81–91, 1999. 47. E. Rescorla. The transport layer security (TLS) protocol version 1.3 (draft 24), Feb. 2018. 48. J. Schmidt. Requirements for password-authenticated key agreement (PAKE) schemes. Technical report, 2017. 49. S. Shin and K. Kobara. Augmented password-authenticated key exchange (AugPAKE). draft-irtf-cfrg-augpake-08. 54 50. S. Shin, K. Kobara, and H. Imai. Security proof of AugPAKE. IACR Cryptology ePrint Archive, 2010:334, 2010. 51. N. Sullivan. Exported authenticators in TLS (draft 4), Oct. 2018. A UC OPRF Definition: Discussion of Revisions In Section 3 we showed a definition of the adaptive UC OPRF functionality, FOPRF , shown in Fig. 3. The definition of FOPRF in Fig. 3 diverges from the definition of the same functionality we gave in the proceedings version of this paper [33], and the definition there was itself a revision of the UC OPRF definition given in [31]. Below we first explain the modifications which both the UC OPRF of Fig. 3 and the UC OPRF of [33] share vis-a-vis the UC OPRF version of [31], and then we explain the revisions made by the UC OPRF of Fig. 3 in comparison to the UC OPRF definition in [33]. Changes from OPRF Functionality of [31]. To use UC OPRF in our application(s) we need to make some changes to the way functionality FOPRF was defined in [31], as described below. Changes (3) and (4) are essentially syntactic and require only cosmetic changes in the security argument. Change (2) makes the functionality weaker. Change (1) is the only one which influences the security argument in a more essential way. Fortunately, the DH-OPRF protocol that we use for OPRF instantiation in our protocols, shown in [31] to realize their version of the OPRF functionality FOPRF , also realizes our modified FOPRF functionality. We recall the DH-OPRF protocol in Fig. 13 in Appendix B, adapting its syntax to our changes in FOPRF , and we argue that the security proof of [31] which shows that it realizes FOPRF defined by [31] extends to the modified functionality FOPRF presented here. (1) We extend the OPRF functionality to allow the adaptive compromise of a server holding the PRF key via a Compromise message. Such action is needed in the aPAKE setting where the attacker can compromise a server’s password file that contains an OPRF key. After the compromise, A is allowed to compute that server’s PRF function on any value of its choice using command OfflineEval. We note that functionality FOPRF distinguishes between (statically) corrupted servers and (adaptively) compromised sessions (the latter representing different OPRF keys at the same server), in consistency with the aPAKE functionality from Fig. 2 that distinguishes between an entirely corrupted server and particular aPAKE instances that can be adaptively compromised by an adversary. (2) We eliminate the condition that FOPRF aborts on message (RcvComplete, ssid , i), denoted (RcvComplete, ssid , S∗ ) in [31], if (i) server S is honest, (ii) i 6= S, and (iii) this OPRF sub-session was initialized by U via command (Eval, ssid , S0 , x) for S0 = S, where S is the server which initialized the OPRF instance with session ID sid . The OPRF protocol of [31] makes use of an authenticated channel, hence the user U aborts if the adversary does not relay the messages from the server S if U ran this protocol with S and S is honest. In contrast, we implement UC OPRF without relying 55 on authenticated channels, so such clause must be deleted. This does not affect the security of our Strong aPAKE protocols, which are password-only AKE’s and are thus intended to be used over insecure channels. (3) We change the SndrComplete message so that it is sent from S if S is uncompromised, and from of A only if S is compromised. This allows an honest aPAKE server to enforce a single OPRF execution, and thus e.g. a single password guess per aPAKE sub-session, which is crucial for aPAKE security. (4) We add an Initialization phase to the functionality, which models a server picking an OPRF key. This interface simplifies the usage of OPRF in aPAKE application where teh server picks an OPRF key for each new user. This modeling differs from [31] who framed OPRF initialization as an interactive procedure through an Eval call while here it is performed locally by the server. Syntactic Differences from Adaptive OPRF Functionality in [33]. In the definition of the Adaptive OPRF functionality in Fig. 3 we introduce several modifications to the way this functionality was defined in the proceedings version of this paper [33]. The purpose of these changes is to simplify some functionality interfaces and to clarify some ambiguities in the definition of FOPRF in [33]. In the explanation below we will refer to the OPRF functionality as defined in Fig. 3 as a “revised” functionality FOPRF , and to the same functionality as defined in [33] as a “proceedings-version” functionality FOPRF . The most important difference introduced by the revised functionality FOPRF is that it uses variable i to index random function instances, denoted Fsid,i (x), whereas the proceedings-version of FOPRF used variable S to index these functions, denoted Fsid,S therein.15 Indeed, a sequence of works on UC OPRF [30–33] used the same convention of indexing random functions kept by FOPRF by variable S. However, this notation for function indices has an unfortunate effect of using the same variable to denote data, i.e. an index of a random function, and to denote real-world entities. For example, in the proceedings-version FOPRF the adversary can make an honest client U output Fsid,S∗ (x) for any bitstring S∗ which does not correspond to an honest server S, but the proceedings-version FOPRF models this with SndrComplete message sent by a corrupt entity S∗ to FOPRF , followed by message (RcvComplete, ..., S∗ ) sent by A. This seemingly requires the adversary to create a “virtual corrupt entity” for every function it creates, and is not compliant with UC conventions [14]. We stress that this new notation for the OPRF functionality requires only syntactic changes in the security arguments given for the UC OPRF schemes in [31, 33]. Indeed, the UC OPRF notation proposed above is a better fit for these security arguments.16 15 In Fig. 3 we use notation Fi instead of Fsid,i , but all records of functionality FOPRF are implicitly indexed by the session identifier sid . 16 In particular, the simulators shown for the UC OPRF protocol in [31] treats variables S0 and S∗ as data, not “virtual corrupt entities”, and the messages (SndrComplete, sid , ssid ) pertaining to S0 6= S were sent by the simulator, i.e. the ideal-world adversary, as required by the revised FOPRF syntax. 56 This change of function-indexing variable is reflected in the modified syntax of functionality commands OfflineEval, SndrComplete, and RcvComplete. We explain this new syntax below, and we note that all these commands use variable S to denote the identifier of a unique entity which initializes an OPRF instance with session id sid by sending (Init, sid ) to FOPRF . (1) Off-line evaluation of function Fsid,i on argument x is executed via command (OfflineEval, sid , i, x). This command can be issued by server S, in which case if S is honest then it is allowed only if i = S, since an honest server S evaluates only its own function Fsid,S , while a corrupt server S can evaluate an arbitrary function instance. The adversary A can also issue this command, but if S is not compromised (or corrupted) then A can send it only for i 6= S, since an adversary cannot evaluate function Fsid,S without compromising server S. (2) Command (SndrComplete, sid , ssid ) models the execution of the server- side OPRF protocol using S’s keys, and its effect is to increment counter tx which counts the server-side OPRF executions for the “honest server function” Fsid,S . This command can be issued by S, or by A if S is compromised because S compromise allows A to run the server-side OPRF protocol using S’s keys. A key difference from the proceedings-version FOPRF model is that in the revised FOPRF command SndrComplete is used to model the sender-side OPRF evaluation only if it is executed on the keys held by S, hence (1) the revised FOPRF does not keep counters tx(sid , i) for adversarial functions Fsid,i , i 6= S,17 and (2) neither the adversary nor any “virtual corrupt server” need to send a SndrComplete message to represent the server-side execution of the OPRF protocol on adversarially-chosen keys. (See also item 4 below.) (3) Command (RcvComplete, sid , ssid , i), issued by A, models the adversary letting server-side OPRF messages pass to the party P who runs the user-side OPRF protocol, in which case P outputs Fsid,i (x) for the argument x which P entered via command (Eval, sid , ssid , S0 , x). The adversary can make P output Fsid,i for any i 6= S because in the real-world a network adversary interacting with P can run the server-side OPRF protocol using arbitrary (O)PRF keys. Adversary A can also set i = S, which represents letting the real-world P receive the server-side OPRF messages executed on S’s keys, but then the protocol succeeds only if tx ≥ 0, i.e. if there is an “unclaimed” S’s session which A can pair with P’s session. In this case P outputs Fsid,S (x) and the functionality decrements counter tx. Other Syntactic Differences from UC OPRF of [33] Other syntactic differences between our revised OPRF functionality of Fig. 3 and the proceedings-version OPRF functionality of [33] include the following: (4) The revised FOPRF keeps a counter only for the “honest” function FS , whereas the proceedings-version FOPRF kept it for each adversarial function. We eliminate the counters for adversarially-controlled functions from the revised model because they appear not to play a meaningful role in the 17 In the proceedings-version functionality these counters were indexed by “virtual corrupt server” identities denoted “S” therein. 57 proceedings-version OPRF functionality since the functionality allows the adversary to increment such counters at will via command SndrComplete. (5) The revised FOPRF models offline computation of any adversarial function Fsid,i , for i 6= S, via call OfflineEval, whereas the proceedings-version FOPRF modeled it using a sequence of adversarial calls Eval, SndrComplete, RcvComplete, i.e. as an instance of on-line protocol running “in the head”.18 (6) The revised functionality FOPRF splits the initialization of random function Fsid,S via command Init from offline evaluation of Fsid,S on some argument x via command (OfflineEval, sid , i, x) for i = S, whereas the proceedings- version FOPRF combined initialization with one offline evaluation call. B The DH-OPRF Protocol Realizing Revised FOPRF Fig. 13 shows the DH-OPRF protocol of [31], called 2HashDH therein, syntactically modified to realize the adaptive OPRF functionality FOPRF defined in Fig. 3 in Section 3. Recall that the FOPRF functionality we show in Section 3 is a revision of the (static) OPRF functionality defined in [31], but it is also a revision of the earlier version of the adaptive OPRF functionality which appeared in the conference version of this paper [33]. The protocol shown below is essentially the same as in [31] and requires the same One-More Diffie-Hellman assumption [5, 31] for security. The only differences between 2HashDH in Fig. 13 and in [31] are syntactic: First, we eliminate the user’s “input-output caching” mechanism used in [31]. Second, the protocol in Fig. 13 outputs the OPRF protocol prefix, namely the U-to-S message a, to both U and S instances. As explained in Section 3 outputting these protocol transcript prefixes provides a better “glue” which a higher-level protocol can use to compose OPRF with some other protocol, as protocol OPRF-AKE of Section 5 does by composing OPRF with AKE. Modifications in the Proof of [31]. The proof of Lemma 1 is very similar to the proof of security given in [31], so we only briefly discuss how our modifications to FOPRF influence the security proof. The ideal-world adversary, i.e., simulator SIM, is shown in Fig. 14. Fig. 14 denotes functionality FOPRF as F for brevity, and it makes the following notational assumptions: (1) F’s initialization message (Init, S, sid ) fixes the identifier S and session ID sid for the rest of the simulation, and all messages to and from F and all its internal records are implicitly tagged with sid ; (2) If S is corrupted then SIM acts as if S was compromised from the very beginning; (3) The identifier S of the server for which F sends the initialization message (Init, S, sid ) is encoded as a different binary string than any integer value; (4) There is integer N s.t. the number of hash function H 0 18 For example, the simulator shown in [31] reacts to the real-world adversary’s local computation of hash function H2 (x, v), v 6= H1 (x)k where k is the key of server S, with three messages to the functionality: (Eval, sid , S0 , x) for arbitrary S0 , (SndrComplete, sid , p) for a unique index p associated with an adversarial “public key” yp s.t. (g, yp , H1 (x), v) is a DDH tuple, and (RcvComplete, sid , p). 58 Components: Hash functions H(·, ·), H 0 (·) with ranges {0, 1}` and G, respectively. Functions H, H 0 are specific to the OPRF instance initialized for a unique session ID sid , and they should be implemented by folding sid into their inputs. Initialization: On input (Init, sid ), S picks k ←R Zq and stores (sid , k). Server Compromise: On (Compromise, sid , S) from the adversary, reveal key k. Offline Evaluation – On input (OfflineEval, sid , S, x) for sid matching record (sid , k), S outputs (OfflineEval, sid , y) for y = H(x, (H 0 (x))k ). Online Evaluation – On input (Eval, sid , ssid , S0 , x), U picks r ←R Zq , records (sid , ssid , r), sends (ssid , a) for a = H 0 (x)r to S0 , and outputs (Prefix, ssid , a). – On input (SndrComplete, sid , ssid 0 ) and message (ssid , a) from U s.t. a ∈ G, S retrieves pair (sid , k) with matching sid , aborts if such pair is not found, else sends (ssid , b) for b = ak to U and outputs (Prefix, ssid 0 , a). – On message (ssid , b) s.t. b ∈ G, U retrieves tuple (sid , ssid , r), aborts if tuple not found, else outputs (Eval, sid , ssid , y) for y = H(x, b1/r ). Fig. 13: Adaptive OPRF Protocol DH-OPRF queries made by the real-world adversary A is upper-bounded by N/2, and the number of online OPRF evaluation sub-sessions started by Z via command Eval to some honest user U is also upper-bounded by N/2. Lemma 1. The DH-OPRF protocol shown in Fig. 13 realizes functionality FOPRF of Fig. 3 under the One-More Diffie Hellman assumption in ROM. Using these assumptions, the simulator acts in a similar way as the one shown in [31]: SIM picks a random key k as S does, and uses it by computing b = ak for every incoming message a ∈ G in SndrComplete. SIM embeds a discrete-log trapdoor in every H 0 output, setting H 0 (x) := g r for random r, and recording this choice as hH 0 , x, ri. SIM similarly embeds a discrete-log trapdoor in OPRF messages a sent on behalf of any honest U session (sid , ssid , U), by setting a ← g r for random r, and recording this choice as hssid , U, ri. SIM also keeps track of all Random Function indexes which are evaluted by adversary A either offline, through H queries, or online, through A’s responses b to user U’s message a. Each function is equated with its “public key” z = g k . First, SIM records the honest S’s function this way as hF, 0, k, zi for z = g k , identifying this function with index “0”. Secondly, every time A queries H on new point (x, u), SIM checks if there is record hF, i, ·, z 0 i and hH 0 , x, ri s.t. z 0 = u1/r , because this is equivalent to DL(H 0 (x), u) = DL(g, z 0 ). If this holds for i 6= 0 then A offline evaluates some adversarial function of its choice, hence in that 59 case SIM sends (OfflineEval, sid , i, x) to F and embeds value Fsid,i (x) returned by F into H(x, u). If this holds for i = 0 and S is not compromised then A must be completing some OPRF instance as the user, hence in that case SIM sends (Eval, sid , ssid 0 , ⊥, x) and (RcvComplete, sid , ssid 0 , SIM, S) to F for some fresh ssid 0 value. If F does not return any answer this means that F ticket counter is 0, and that this local computation of Fsid,S (x) by A violated the security properties of FOPRF , in which case SIM halts, and the simulation obviously diverges from the real world execution. Otherwise SIM embeds value Fsid,S (x) returned by F into H(x, u). 1. Pick r1 , . . . , rN ←R Zq . Set g1 := g r1 , . . . , gN := g rN , and J := 1 and I := 1. 2. On (Init, S, sid ) from F pick k ← Zq and record hF, 0, k, z = g k i. 3. On (Compromise, sid ) from A, declare S as compromised, retrieve tuple hF, 0, k, zi, send (Compromise, sid ) to F, and send (sid , k) to A. 4. On A’s fresh query x to H 0 , set H 0 (x) ← gJ , record hH 0 , x, rJ i, and set J++. 5. On (Eval, sid , ssid , U, S0 ) from F, set a ← gJ , respond with prfx = a to F, send (sid , ssid , a) to A as U’s message to S0 , record hssid , U, rJ i, and set J++. 6. On (SndrComplete, sid , S) from F and message (sid , ssid , a) (where a ∈ G) from A sent on behalf of some user to server S, respond with prfx = a to F, retrieve hF, 0, k, zi, and send (sid , ssid , b) for b = ak to A as S’s response. 7. On message (sid , ssid , b) (where b ∈ G) from A sent on behalf of some server to user U, retrieve records hssid , U, ri and hF, i, ·, z 0 i for z 0 = b1/r . If there is no record hF, i, ·, z 0 i, set i := I, record hF, i, ⊥, z 0 i, and set I++. In either case send (RcvComplete, sid , ssid , U, i) to F. 8. On A’s fresh query (x, u) to H, retrieve record hH 0 , x, ri. If there is no such record, then pick H(x, u) ←R {0, 1}` . Otherwise do the following: (1) If record hF, 0, k, zi satisfies that z = u1/r then i. If S is compromised, send (OfflineEval, sid , S, x) to F, and on F’s response (OfflineEval, sid , y) set H(x, u) := y. ii. If S is not compromised, pick a fresh identifier ssid ∗ and send (Eval, sid , ssid ∗ , ⊥, x) and (RcvComplete, sid , ssid ∗ , SIM, S) to F. If F ignores the last message then output halt and abort. Else on F’s response (Eval, sid , ssid ∗ , y) set H(x, u) := y. (2) Else, if there is tuple hF, i, ⊥, u1/r i for i 6= 0 then send (OfflineEval, sid , i, x) to F, and on F’s response (OfflineEval, sid , y) set H(x, u) := y. (3) Else, record hF, i, ⊥, u1/r i for i = I, send (OfflineEval, sid , i, x) to F, and on F’s response (OfflineEval, sid , y) set H(x, u) := y and set I++. Fig. 14: Simulator SIM for Protocol DH-OPRF (with FOPRF denoted as F) Finally, if (x, u) query to H 0 does not match any recorded function hF, i, ·, z 0 i s.t. z 0 = u1/r , then SIM defines a new function hF, i0 , ⊥, z 0 i for fresh index i0 and z 0 = u1/r . SIM interprets A’s OPRF responses b to messages a = g r which SIM sends on behalf of some honest user U in a similar way: Note that if a = g r then z 0 = b1/r = g DL(a,b ). Therefore SIM can identify the function computed by 60 U on this OPRF interaction with the public key z 0 = b1/r . As in the case of responding to H queries, SIM first checks if there exists record hF, i, ·, z 0 i, and otherwise it creates a new record hF, i0 , ⊥, z 0 i for fresh i0 . The only non-syntactic changes in the argument that under OMDH assumption this simulation presents the same view as in the real execution, compared to the simimlar argument given in [31], is that (1) A may at any time compromise server S for a specific sid and learn key k; and (2) that if some user session (sid , ssid u , U) and some server session (sid , ssid s , S) output the same prefixes prfx = a, then this interaction does not increase the ticket counter, and does not count to the pool of OPRF interactions which A can use to compute Fsid,S (x) on some input x. Regarding (1), note that after server compromise A can compute the server’s function on any argument, but SIM can detect that by catching H query on (x, v) for v = (H 0 (x))k , and can simulate this by sending (OfflineEval, sid , S, x) to FOPRF . Furthermore, note that event halt may only occur if server S is not marked compromised at that time; hence the argument which upper-bounds Pr[halt] given in [31] is not affected by this change because it assumes that S is not compromised at the time. Regarding (2), it is easy to see that if A who forwards to some server session the message a = g r sent by SIM on behalf of some honest user U session, the security reduction can respond to such message with b = z r , where z is the OMDH challenge public key z = g k . Therefore the reduction will not need to query the One-More Diffie-Hellman oracle (·)k on all such S sessions, and thus these sessions will not increase the number of arguments x on which adversary A can compute u = (H 0 (x))k , by querying H on (x, u), without breaking the OMDH assumption. 61