Real-time pricing (RTP) of electricity for consumers has long been argued to be crucial for realizing the many envisioned benefits of demand flexibility in a smart grid. However, many details of how to actually implement an RTP scheme are still under debate. Most organized wholesale electricity markets in the US implement a two-settlement mechanism, in which day-ahead electricity price forecasts guide financial and physical transactions for the next day and real-time ex post prices settle any real-time imbalances. It is therefore natural to let consumers respond to the day-ahead prices in real time. However, if such a scheme is not designed carefully, the inherent closed-loop operation may lead all consumers to respond in the same fashion, causing large swings in real-time demand and prices, which may jeopardize system stability and increase consumers’ financial risks.
To overcome the uncertainties and undesired demand peaks caused by the 'selfish' behavior of individual consumers under RTP, in this research we develop a fully decentralized, price-driven demand response (DR) approach within a game-theoretic framework. In game theory, agents usually make decisions based on their beliefs about competitors' states, which requires maintaining a large amount of information and can therefore be intractable and implausible for a large population. Instead, we propose regret-based learning in games, in which each agent relies only on its own history of actions and the utility it received. We study two learning mechanisms: bandit learning with incomplete-information feedback, and low-regret learning with full-information feedback. With learning in games, we establish performance guarantees both for each individual agent (i.e., regret minimization) and for the overall system (i.e., bounds on the price of anarchy).
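As a minimal illustration of low-regret learning with full-information feedback, the sketch below runs the standard Hedge (exponential weights) rule for a single consumer choosing one of several time slots for a deferrable load. It is not the algorithm developed in this research. The slot prices, the noise model, and the names `hedge` and `simulate_consumer` are hypothetical, and losses are assumed to be normalized to [0, 1].

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run Hedge over a sequence of loss vectors with entries in [0, 1].

    Returns the learner's expected cumulative loss and the action
    distribution it played in the final round.
    """
    k = len(loss_rounds[0])
    cum_loss = [0.0] * k  # cumulative loss of each action so far
    expected_loss = 0.0
    probs = [1.0 / k] * k
    for losses in loss_rounds:
        # Weight of an action is exp(-eta * cumulative loss); subtracting
        # the minimum keeps the exponentials numerically well behaved.
        base = min(cum_loss)
        weights = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Full-information feedback: the loss of every action is revealed.
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
    return expected_loss, probs


def simulate_consumer(mean_prices, horizon, seed=0):
    """Hypothetical setup: noisy normalized prices for each time slot."""
    rng = random.Random(seed)
    loss_rounds = [
        [min(1.0, max(0.0, rng.gauss(m, 0.1))) for m in mean_prices]
        for _ in range(horizon)
    ]
    k = len(mean_prices)
    eta = math.sqrt(8.0 * math.log(k) / horizon)  # standard tuning for a known horizon
    expected_loss, probs = hedge(loss_rounds, eta)
    best_fixed = min(sum(r[a] for r in loss_rounds) for a in range(k))
    regret = expected_loss - best_fixed
    bound = math.sqrt(horizon * math.log(k) / 2.0)  # classical Hedge regret bound
    return regret, bound, probs
```

With this tuning of `eta`, the expected regret against the best fixed slot is at most `sqrt(T ln K / 2)` for any loss sequence in [0, 1], so the per-round regret vanishes as the horizon `T` grows. The bandit variant would observe only the loss of the slot actually chosen and would need an estimator for the unobserved losses.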
In addition to price-driven demand response, we apply the same game-theoretic framework to peer-to-peer energy trading auctions. A market-based approach can better incentivize the development of distributed energy resources (DERs) on the demand side. However, the complexity of double-sided auctions in an energy market, together with agents’ bounded rationality, may invalidate many well-established results in auction design and consequently hinder market development. To address these issues, we propose an automated bidding framework based on multi-armed bandit learning through repeated auctions, which aims to minimize each bidder’s cumulative regret. We also use this framework to compare the market outcomes of three different auction designs.
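The idea of automated bidding through bandit learning can be sketched with a single buyer who repeatedly picks one of a few discrete bid prices and observes only the utility of the bid it placed. The example below uses the standard UCB1 index rule as a stand-in, since the specific bandit algorithm is not named here. The pay-as-bid rule, the exogenous clearing-price distribution, the valuation normalized to 1, and the names `ucb1_bidder` and `run_demo` are all assumptions made for illustration.

```python
import math
import random


def ucb1_bidder(bid_levels, valuation, draw_clearing_price, rounds):
    """Repeatedly choose a bid with UCB1, assuming utilities lie in [0, 1].

    Returns how often each bid level was played and its average utility.
    """
    k = len(bid_levels)
    counts = [0] * k
    mean_utility = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1  # play every bid level once to initialize
        else:
            # Optimism in the face of uncertainty: empirical mean plus bonus.
            arm = max(
                range(k),
                key=lambda a: mean_utility[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        bid = bid_levels[arm]
        price = draw_clearing_price()
        # Bandit feedback: only the utility of the submitted bid is observed.
        # Assumed pay-as-bid rule: the buyer wins if its bid covers the price.
        utility = (valuation - bid) if bid >= price else 0.0
        counts[arm] += 1
        mean_utility[arm] += (utility - mean_utility[arm]) / counts[arm]
    return counts, mean_utility


def run_demo(rounds=5000, seed=1):
    """Hypothetical market: clearing price drawn uniformly from [0.3, 0.7]."""
    rng = random.Random(seed)
    bid_levels = [0.2, 0.4, 0.6, 0.8]
    counts, mean_utility = ucb1_bidder(
        bid_levels, 1.0, lambda: rng.uniform(0.3, 0.7), rounds
    )
    most_played_bid = bid_levels[counts.index(max(counts))]
    return counts, mean_utility, most_played_bid
```

In this toy market the expected utilities of the four bids are 0, 0.15, 0.30, and 0.20, so a bidder with low cumulative regret should concentrate its plays on the bid of 0.6. In an actual double-sided auction the clearing price would depend on the other participants' bids rather than on a fixed distribution.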